: keep-alive

HTTP/1.1 200
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Allow-Headers: Content-Type

data: {"message_type": "asking_sites", "message": "Asking Iunera", "query_id": ""}

data: {"message_type": "result_batch", "results": [{"url": "https://www.iunera.com/kraken/fabric/multi-dimensional-time-series-analysis-olap/", "name": "Multidimensional Time Series Analysis VS OLAP", "site": "iunera", "siteUrl": "iunera", "score": 60, "description": "This article provides an in-depth overview of multi-dimensional Time Series Analysis and its connection to OLAP methods, focusing on data preparation and analytical operations such as Slice, Dice, Pivot, Roll-Up, and Drill-Down. It is relevant because it covers foundational concepts and practical examples useful for understanding and applying multi-dimensional analysis techniques in time series data contexts, despite the absence of a specific question.", "schema_object": {"@context": "https://schema.org", "@type": "Article", "headline": "Multidimensional Time Series Analysis VS OLAP", "description": "80% of Data Science work is information preparation. We describe the foundations of multi-dimensional Time Series Analysis from Data Warehousing OLAP.", "articleBody": "Multi-dimensional Time Series Analysis and OLAP methods are important when working with Time Series Data.  \n\n\n\nOften multi-dimensional Time Series Analysis (as the term is referred to) is a complete set of methods in applying machine learning to create forecasts or search for anomalies and patterns. Common multi-dimensional analysis operations get applied in Business Intelligence and Data Warehousing where they are often called Online AnaLytical Processing (OLAP) operations [1].\n\n\n\nKnowing these multi-dimensional Time Series Analysis foundations is essential, because at least 80% of Data Science work is Big Data and Landscape preparation. \n\n\n\nIn this article, we focus on good old multi-dimensional Time Series Analysis foundations to prepare, investigate and aggregate the Time Series Data in a deterministic way. 
We also discuss and describe what the most important multi-dimensional Time Series Analysis and OLAP methods are and show examples of how the different operations are applied on a Time Series Data sets.\n\n\n\nTime and viewpoint of the same object reveal different insights. We see the city of Matera and how a tower, the city and its insights look very different at different times of the day and from varying perspectives. Time series analytics and OLAP are pretty much the same concepts. With multi-dimensional Time Series Analysis operations, you can change the perspective and gain new insights easily. \n\n\n\n\t\t\t\n\t\t\t\tTable of Contents\n\t\t\t\t\n\t\t\t\n\t\t\n\t\t\t\n\t\t\t\tWhy Data Warehouse OLAP VS Time Series Analysis?Why do we need multi-dimensional Time Series OLAP operations?Multi-dimensional Time Series Data foundationsHow to Slice Time Series DataDice Time Series Data in a subcubeAdvanced OperationsSplit, Merge, PivotRoll-Up and Drill-DownTypical applications of multi-dimensional Time Series Analysis operationsSum Up FAQSummary and conclusionUpdate (29th April 2021):\u00a0Check out our latest project, Fahrbar!Related Posts\n\t\t\t\n\t\t\n\n\nWhy Data Warehouse OLAP VS Time Series Analysis?\n\n\n\nAnalytical multi-dimensional OLAP operations matter in the Big Data area, because Time Series Data can be processed and analyzed in the same way like data from a Data Warehouse. Multi-dimensional Time Series OLAP based Analysis is essentially applicable in Data Warehouses, which are classically filled with data out of operative enterprise systems.\n\n\n\nTechnologically, the multi-dimensional Time Series Analysis methods from Data Warehouses help to aggregate and generate subsets of Time Series Data in an efficient way. \n\n\n\nThese multi-dimensional analysis foundations from  Data Warehouses are needed by Data Scientists to assess  Time Series Data quality and to prepare the data then to apply the ultimately fancy Time Series Data.   
\n\n\n\nOut of the pure volume, rare data deletion, other data sources and complexity, Big Data landscapes are realized with different technology stacks. \n\n\n\nData Scientist  and  Big Data Engineers  combine classically multiple open source systems and leverages functionality in similar to a  Data Warehouse in  Big Data landscapes. \n\n\n\nTime Series Databases often offer support for a subset of the complete analytical multi-dimensional operations which are available in  Data Warehouses. Similarly, a  Data Scientist can code and apply the operations by hand when investigating  Time Series Data sets.\n\n\n\nHence, it helps to know the origin of the operations and how they work in order to apply them manually on Big Data or to leverage the right Time Series Database for this matter.\n\n\n\nWhy do we need multi-dimensional Time Series OLAP operations?\n\n\n\nJust imagine the world of an executive:\n\n\n\nDozens of plants, hundreds of salesmen, thousands of employees and millions of sold products. How do you push the company in the right direction? Who are our star salesmen? Which are the most profitable products? Where and which products get sold the most? \n\n\n\nThese questions are asked by executives in order to steer a company and effective means to analyze the data in this way are needed. Therefore, one needs a simple way to compute different perspectives of the same data. Furthermore, one needs to view and get insights into specific subsets of the same data. \n\n\n\nData Scientists face the same issue as they need to determine which fraction of the Time Series Data they use to train AIs or to do descriptive Time Series Data analysis or investigations. 
Thus, they need ways to have data available in a pre-processed format to get the necessary perspectives easily when it is required, technology-wise or enterprise-wise.\n\n\n\nMulti-dimensional Time Series Data foundations\n\n\n\nIn the following, we use this sample dataset to help explain how multi-dimensional Time Series Data Analysis operations can be applied. We see that Date, Quarter and Year refer to Time Series Data and time intervals. \n\n\n\nA fact table for Time Series Analysis, connecting time points and intervals to data dimensions (Country, Address, etc.) and measures (Revenues).\n\n\n\nWe show a schematic multidimensional arrangement of this data in the following graphic. The multidimensional arrangement is typically called cube or Hypercube in Data Warehousing.\n\n\n\nA multidimensional visualisation in form of a cube from a fact table\n\n\n\nWe see that the different columns are mapped onto dimensions and the revenues are represented by the measure (M) in the middle. Hence, the values of the columns link the indicator or revenues together. \n\n\n\nIn general, every dimension or attribute may be used as an element in a query to compute the revenues for the specified elements. For instance, such computation can be the revenues for a certain region within a certain time and for a certain product.  \n\n\n\nWe also see that the dimensions can be seen as hierarchies such as the store and its located country. This way, one may compute totals and subtotals based on such hierarchies such as countries or the address. \n\n\n\nHow to Slice Time Series Data\n\n\n\nSlicing is the operation of cutting a specific slice out of the data structure that is commonly applied when Time Series Data Analysis is done. \n\n\n\nThe data gets cut down to various dimensions, elements or attributes. We visualize different slices in the following image &#8211; there we see the slices for the various dimensions of the sample data. 
\n\n\n\nSlices discriminate and cut out different perspectives of the records in Time Series Analysis. For instance, the store dimension can be fixed to a specific dimension member and therefore the slice for this specific store is created. However, slicing is possible for all dimensions and their attributes and the image just shows samples of how a dataset can be discriminated against with slices [2].\n\n\n\nWe use a slice in the following. The slice of &#8220;All Stores and Dates for the Product cup&#8221; discriminates the dataset to the fixed value of a cup. \n\n\n\n Table 2A slice of a Time Series Data Set: All Stores and Dates for the Product Cup\n\n\n\nWe see that just the stores in Madrid and Heidelberg are the remaining cup sellers. This way, decision-makers can easily view the total revenues for the cup product at hand and compare it with the totals. Then, they can use those totals to compare the different stores to the total revenues and see which store is the leading cup seller.\n\n\n\nSimilarly to the slice before, we present the slice of &#8220;All product and Dates for the Heidelberg Store&#8221; below. There we can see that the Heidelberg store sold three products for the total amount of 8\u20ac. \n\n\n\nOut of this data, charts or other graphical visualization can be populated for a specific store. Such charts may give decision-makers insights into whether the revenues for a specific store increase over time. We show a simple visualization sample in the figure below. \n\n\n\nAll Product and Dates for the Heidelberg Store \n\n\n\nSlice Chart of Time Series Data. Sample visualization of the table. The original data has been used to compute pie charts to give executives a visual overview.\n\n\n\nLastly, we show the slice of &#8220;All Stores and products for the first Date Quarter&#8221;. We see that only Heidelberg and Paris sold products in Q1. Furthermore, the revenues compared to the total are much higher for Paris than for the Heidelberg store. 
\n\n\n\nAll Stores and Products for the first Date Quarter\n\n\n\nHowever, in general, such slices are the beginning of further Time Series analysis and the building of total sums and similar aggregations is a starting point to do advanced investigations. \n\n\n\nOur example contained only limited cases that are easy to overview to demonstrate the general idea of how data can be sliced. In reality, we have to imagine thousands and millions of records that are hard to overview. Through slicing, the amount can be dramatically reduced. \n\n\n\nImagine not only to compute totals for Time Series Data but also subtotals and similar. For instance, subtotals may be computed for the specific stores, products and lead to new Time Series analysis insights. \n\n\n\nHere, we focused on the generic approach to slice data, to reduce a result set and demonstrate advanced analysis capabilities later on. For us, it is important to note, that slicing fixes a dimension on a certain member (e.g. the specific store, location or interval) in order to reduce the original dataset.\n\n\n\nDice Time Series Data in a subcube\n\n\n\nSimilar to the slice operation discussed &#8211; is the Dice operation. The dice operation combines multiple slice operations at one time to create a subcube. \n\n\n\nDice operations discriminate datasets to a subset of the original data. In general, dicing is done by using multiple slices together [2].\n\n\n\nIn the picture, we see how different dimensions are fixed to specific values and a subcube is extracted. For our demo data, we present the remaining subcube in the following table. \n\n\n\n Time Series Data Dice Result\n\n\n\nThe table shows that two datasets remain. Furthermore, neither slicing nor dicing change the dimensionality of the  Time Series Data. All dimensions that were existing before are still remaining in the result. \n\n\n\nIn general, such a Dice operation is often used to start analysis for a specific entity. 
Imagine a local executive who is responsible for the Heidelberg store and wants to do the future planning for the store in Heidelberg for 2012. \n\n\n\nHe needs to determine how many cups he needs now and how many he has needed in the past periods to go into negotiations with the cup vendor. In order to do so, he is interested in the historical data of the year 2011 for his specific store. \n\n\n\nHe is not interested in the data of the whole company and focuses on his store. So, all computations that he wants to do are based on the Heidelberg- 2011- cup -result. If he would look at the whole company data, it would be overwhelming and unimportant information for him. \n\n\n\nAdvanced Operations\n\n\n\nBefore, we discussed the basic Slice and Dice operation as Time Series Analysis methods. We saw that the data representation was uniform and the operations can be used to reduce the dataset. We provide a visualization on how our demo cube can be explored with some advanced multi-dimensional time series analysis. \n\n\n\nAdvanced multi-dimensional Time Series Analysis: Different operations are applied to a dataset that is presented in (a). From (a) to (b) a Merge is applied. In order to transform the data to (c), a Roll up is applied. This Roll up is applied to all dimensions in (d).\n\n\n\nWe show a subset of the original time series data that can be created with a Dice operation in a.). This subcube is the foundation to apply different operations and to do investigations with them. The other cubes b.), c.) and d.) show the outcome after different operations.\n\n\n\nWe indicate which original dimensions and attributes result in the different outcomes by linking them with dashed lines. \n\n\n\nIn the following, we discuss each resulting perspective on its own. 
In order to provide a better understanding, we show resulting tables in the way they are commonly used by  Time Series Database or Data Warehouse exploration tools.\n\n\n\nWe also show different methods of building subtotals and totals in order to provide indications of what can be done in practice. \n\n\n\nNonetheless, since some operations have inverse operations that are named different we refer to both operations in the following. \n\n\n\nFor instance, when a.) is the origin as an operation to compute cube b.), it is called split. Whereby b.) to a.) is called merge. \n\n\n\nWe always mention first the &#8220;a.) to&#8221; operation name and then the inverse name in case it may exist. \n\n\n\nWith that naming guide, we first focus on the original data set in a.) and then follow up with the different operations.\n\n\n\n (a) Dice of Origin before applying advanced multi-dimensional analytic operations\n\n\n\nWe provide the point of origin in the table above, before applying advanced multi-dimensional time-series operations in a.). \n\n\n\nIn reference to our sample dataset from the beginning, we see that this dataset is a subset of the sample data that was created by a Dice operation. Therefore, only a specific product, store, date and revenues are contained in the records.  \n\n\n\nWe imagine that a controller looks at the data at the beginning of his investigation. He generates charts from it and applies analytic operations. Then he decides which operations he applies for further explorations. Therefore, we imagine that he can end up with the operations that we present in the following. \n\n\n\nSplit, Merge, Pivot \n\n\n\nThe multi-dimensional Time Series Analysis Split operation is used to generate b.). Vice versa, as inversive operation Merge generates a.) from b.). The split operation increases the dimensionality of the cube. \n\n\n\nSplit can be applied to all kinds of dimensional attributes to get into specific details. 
Such attributes may be the store size, a store type or similar to regard every possible angle. \n\n\n\nAll together, Split and Pivot can be used to increase the dimensionality and arrange the dimensions in the desired way to reveal desired details.\n\n\nPivot is used to change the viewpoint. and to rotate rows into columns to ease computations. Therefore, often Split is applied and then the new dimensions coming out of Split are then rotated from rows to columns. Hence, we describe the outcome of a Split and Pivot operation together in the following.&nbsp;\n\n\nSplit and Merge\n\n\n\nWe can see that the Date is split into the year and the quarter at the same time. In contrast to the three dimensions (Store, Product, Date) in cube a.), cube b.) has now four (Year, Store, Product, Quarter). \n\n\n\nWe present the result data in the following table after we applied an additional Pivot operation for a better viewpoint. In the following, we explain the Pivot operation in more detail. \n\n\n\nSplit and Pivot:The Split operation increases the dimensionality of the Time Series Data and Pivot rearranges the data perspective.\n\n\n\nPivot (also called Rotate)\n\n\n\nPivot alters the content of an axis in a spreadsheet. Therefore, it is also called rotate, because a dataset is rotated in itself.\n\n\n\nUltimately, the spreadsheet shows quarters arranged vertically for the years. This makes it possible to build subtotals for both; years and quarters. When we imagine a larger dataset, this gets even handier. \n\n\n\nCurrently, our time-series data has no duplicate sold products in different quarters, but we see that such issues can be handled through the total aggregation that is shown vertically. We see in the resulting totals that the quarter with the most revenues in total in Q1, followed by Q2. 
Furthermore, there was a rise in revenues from 18\u20ac to 23\u20ac.\n\n\n\nIn this way, decision makers have different possibilities to Pivot the data to analyze it from different perspectives.\n\n\n\nRoll-Up and Drill-Down\n\n\n\nRoll-Up (c.) is decreasing the granularity of a dimension or a dimensional hierarchy. It is the &#8220;zoom-out&#8221; operator. The opposite operator is the drill down. We use this roll up operator in c.). \n\n\n\nThere, we see the product dimension generalized to the product group and all the revenues are aggregated and computed together on this level. One group may contain multiple product types, but one product has only one group. This way, the product group is a reduction of different elements. \n\n\n\nRoll-Up is decreasing the granularity of a dimension or a dimensional hierarchy of Time Series Data.\n\n\n\nWe see the revenues for the different product groups, dates and stores. Like before, it is possible to build subtotals like for dishes or a specific store. With such subtotals, we can depict easily that dishes have been the main sold product in 2011 in the Heidelberg store.\n\n\n\nHowever, in general, Roll-Ups and Drill-Down are very effective to gain an overview or an insight into a dimensional hierarchy or attributes of a dimension. \n\n\n\nWe apply a Pivot product group and exclude the date dimension to get a better overview. \n\n\n\nPivot and Dimension Reduction\n\n\n\nNow, we can easily depict the totals for the different stores and the various Product Groups. This reveals that furniture is responsible for the most revenues and Paris is identified to be the Store leader in revenues. \n\n\n\nSuch kind of analysis in business is used to compare the performance of stores and product groups before drilling into details.\n\n\n\nAs the last operation, we show the combination of different combined Roll-Ups at the same time. 
\n\n\n\nGeneral Roll-up together with Pivot\n\n\n\nIn the table, we now see the different countries in relation to Years and Product Groups. This makes it possible to depict the top Product Groups and the top-performing Countries at once. \n\n\n\nThe date has been generalized to Year, Products to the Product Group and Store to its Country. The general idea is to compare the performance of countries in different years. \n\n\n\nIn Germany, we have multiple stores and all of these stores are ragged together and their revenues are aggregated. Imagine, we have a large dataset, where all countries have multiple stores. Executives can have a look at the high-level results and work out their decisions.\n\n\n\nTypical applications of multi-dimensional Time Series Analysis operations\n\n\n\nTypical applications of multi-dimensional Time Series Analysis are data preparations, extractions and investigations for a certain timeframe or interval to learn more about events and data.\n\n\n\nSlices or Dices can also support extracting data for a certain time frame to test machine learning on small examples within a certain timeframe. \n\n\n\nAnother application type is recommendation systems where different dimensionalities are used to do investigations about features. \n\n\n\nLast but not least structured analysis, descriptive analytics and time-based aggregations are also other typical examples of where the described operations help.\n\n\n\nSum Up FAQ\n\n\n\t\t\n\t\t\t\tWhat are the different analytical operations from Data Warehousing which can be applied for Time Series Analysis?\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nSlice, Dice, Pivot, Roll-Up, Drill-down, Split and Merge \n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhat is the Slice operation for Time Series Analysis?\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nSlice fixes a specific dimension to a specific value. 
For instance, the store dimension can be fixated to a specific store.\n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhat is the Dice Operation on Time Series Data?\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nThe dice operation combines multiple slice operations at one time to create a subcube.  \n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhere do the common multi-dimensional operations for Time Series Analysis originate?\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nThey originate from Data Warehousing and in special the area of OLAP.\n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhat is the Pivot operation on Time Series Data?\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nPivot is used to change the viewpoint and to rotate rows to columns and vice versa.  \n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhat is the Split operation used for when doing Time Series Data investigations?\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nSplit is used to increase dimensionality whereby dimensions are arranged orthogonally.  For instance, to split dates into years and quarters. This way, the different quarters of years can be compared.\n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhat is the inverse operation of Split for Time Series Analysis?\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nMerge\n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhat is a Drill-Down and a Roll-up when analyzing Time Series Data?\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nRoll-Up is decreasing the granularity of a dimension or a dimensional hierarchy (e.g. country to the continent). It is the &#8220;zoom-out&#8221; operator. The opposite operation is the Drill-Down.  \n\n\t\t\t\n\t\t\n\t\t\n\t\t\t\t\n\t\t\t\tWhat is Time Series OLAP?\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\n\nOriginally, multi-dimensional operations (OLAP operations) are applied in a Data Warehouse on top of Time Series data which comes classically from ERP systems. 
In Big Data Analysis the same operations as in a Data Warehouse are executed manually with Big Data Tools or Time Series Databases what is then referred to as Time Series OLAP.\n\n\t\t\t\n\t\t\n\t\t\n\t\t\n\n\n\n\nSummary and conclusion\n\n\n\nWe saw examples of the multi-dimensional Time Series Analysis operations that can be applied on Time Series Data. In special, we looked at Slice, Dice, Pivot, Roll-Up, General Roll-Up and Drill-down. Then, we referenced some exemplary applications where such operations are handy and concluded with a summary FAQ.\n\n\n\nNow, Data Scientists and Big Data Engineers can qualify Big Data tools like Time Series Databases by their capability to execute the different operations on top of Time Series Data. \n\n\n\nWith the right tools, the data can be pre-processed better and faster before using advanced Time Series Data investigation methods. This speeds up the flexibility and improves the speed of how new insights and forecasts from Time Series Data can be revealed.\n\n\n\n\n\t\t\t\n\t\t\tReferences\n\t\t\t\n\t\t\n\t\t\t\n\n V. K\u00f6ppen, G. Saake, K.-U. Sattler. Data Warehouse Technologien: Technische Grundlagen. 978-3826691614. mitp Professional. 2012. H.-G. Kemper, W. Mehanna, C. Unger. Business Intelligence- Grundlagen und praktische Anwendungen: : Eine Einf\u00fchrung in die IT-basierte Managementunterst\u00fctzung. 978-3834802750. Vieweg+Teubner Verlag. 
2004.\n\n\n\t\t\n\n\n\nUpdate (29th April 2021):&nbsp;Check out our latest project, Fahrbar!\n\n\n\n\nFahrbar: Just What We Need For Public Transport Crowding\n\n\n\n\n\t\t\t\n\t\t\t\tGet in touch with us\n\t\t\t\n\t\t\t\n\t\t\t\tIf you are interested in Fahrbar or want to find out how we can help you leverage your data\n\t\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\tContact us\n\t\t\t\t\n\t\t\t\n\t\t\n\n\nRelated Posts\n\n\n\n\nMultimodal Transport Routes And Why Many Of Them Suck\n\n\n\n\n\nThe 101 On The Underrated Topic Of Master Data\n\n\n\n\n\nA Quick Look At Slowly Changing Dimensions\n\n\n\n\n\nEssential Data Source Origins You Need To Know\n\n\n\n\n\nTop 5 Big Data Time Series Applications\n\n\n\n\n\nWhat are Time Series Data Models and Analysis?\n\n\n\n\n\nConcepts and Characteristics of Big Data Analytics\n\n\n\n\n\nPersonalized e-Commerce Recommendations: The Secret to More Sales\n\n\n\n\n\nThe NGO With The Bicycle Referendum &#038; Its Big Data Relevance", "datePublished": "2020-03-05T11:44:30+01:00", "dateModified": "2021-09-23T07:02:04+01:00", "url": "https://www.iunera.com/kraken/fabric/multi-dimensional-time-series-analysis-olap/", "author": "Tim", "image": "https://www.iunera.com/wp-content/uploads/multi-dimensional-time-series-analysis-reveal-different-viewpoints.jpg?v=1583356711", "articleSection": "Time Series Analytics", "keywords": "bigdata, businessIntelligence, dataScience, dataWarehouse, dice, drillDown, multiDimensional, OLAP, pivot, rollUp, slice, timeSeries, timeSeriesAnalysis"}}], "query_id": ""}

data: {"message_type": "result_batch", "results": [{"url": "https://www.iunera.com/kraken/sustainability/bike-friendly-cities/", "name": "What You Need To Know About Bike-friendly Cities", "site": "iunera", "siteUrl": "iunera", "score": 60, "description": "This article provides a comprehensive overview of bike-friendly cities, discussing their benefits, examples, and data-driven approaches to improving cycling infrastructure. It is somewhat relevant as it touches on sustainability and urban planning, which are broad topics that could align with various inquiries.", "schema_object": {"@context": "https://schema.org", "@type": "Article", "headline": "What You Need To Know About Bike-friendly Cities", "description": "Is it really good to have bike-friendly cities and how do we get them? Let's find out in this article.", "articleBody": "Is it really good to have bike-friendly cities and how do we get them? Let&#8217;s find out in this article.\n\n\n\n\t\t\t\n\t\t\t\tTable for cyclists of contents\n\t\t\t\t\n\t\t\t\n\t\t\n\t\t\t\n\t\t\t\tCycling boomBike-friendly citiesData collection and analysis to determine and improve bike-friendlinessImagining velotopias or something along those lanes linesRelated Posts\n\t\t\t\n\t\t\n\n\nCycling boom\n\n\n\nSaid to be &#8220;one of the fastest, most flexible and reliable methods of transport&#8221;, cycling has boomed further thanks to Covid-19. The lockdowns motivated many people to roll out their bicycles or buy some to join the bandwagon and pedal their way to fun and health. 
Here are some figures to describe the surge in bike demand:\n\n\n\nPeople for Bikes in Colorado, USA, reported that 1/10 adults claimed to ride a bike for the first time in at least one year.Rails-to-Trails Conservancy reported that cycling on US trails peaked at 217% in the first week of April 2020 compared to the first week of April 2019.Strava reported that 8.1 billion miles were logged by cyclists around the world in 2020.Figures from the Bicycle Association in the UK show that bike sales and bike servicing increased by 41%.The Mobility Monitor of Germany reported that 32% of the people interviewed are cycling more in response to Covid-19 and 27% want to continue cycling more after the pandemic.\n\n\n\nAlthough the avoidance of public transport and bike-sharing boom partially contributed to the cycling boom, it seems that the government incentives or temporary &#8220;Copenhagenize&#8221; adjustments like pop-up cycling lanes and cycling superhighways played a bigger role in the boom.\n\n\n\nHowever, suppliers struggled to keep up with the surge in bicycle demand, especially with the restrictions disrupting global supply chains. One big retailer in the UK described the demand as if customers were snatching available stock &#8220;like piranhas&#8221;. Seeing that cycling is on the rise despite talks of whether the cycling boom will stick, it might be worth learning more about bike-friendly cities, pandemic or not.\n\n\n\nBike-friendly cities\n\n\n\n&#8220;Copenhagenize&#8221; is used to describe cities that are working to become more bike-friendly. Of course, the terms Copenhagenize and, hence, the Copenhagenize Index (more on this later) were modelled after the city of Copenhagen, well-known for its utmost bike-friendliness like its popular Dutch counterpart. \n\n\n\nIn fact, both Amsterdam and Copenhagen have long been bike-friendly cities, earned after much struggle in the 1970s. 
Since Copenhagen is home to 675,000 bicycles which is 5 times more than cars, 29% of all journeys and 41% of commutes to work or study across Copenhagen are by bike. \n\n\n\nThanks to the availability of safe cycling paths and the ongoing investment in improving and adding cycling infrastructure, the city not only reaps the obvious health and environmental benefits, but also economic benefits of 4.80 krone per km of cycling and 10.09 krone per km of replacing a motor drive with cycling.\n\n\n\nThese economic benefits arise from shorter commute times (from less traffic jams),&nbsp;less sick leave and more retail spending (since more people cycle to shops). Imagine multiplying those gains by millions of kilometres.\n\n\n\nSimilarly, Freiburg is bike-friendly and has a well-planned transport system, among other things that has earned the so-called ecological capital of Germany international recognition for being livable, sustainable, child-friendly and historically preserved. Since Freiburg prioritises pedestrians, cyclists and public transport, the city is bursting with the following privileges: \n\n\n\nThe city centre is closed for private motor vehicles.Time restrictions are also placed on motor vehicles.Right of way is given to bicycles.There is a 400km network of cycling routes.There is an ever-expanding city tram network.The streets were designed with the harmony of cycling lanes, pedestrian zones, and tram lines in mind.\n\n\n\nMaybe because of Amsterdam, Copenhagen, and Freiburg, it&#8217;s no surprise that other countries have tried to follow suit with their attempts to create bike-friendly cities in their own backyard. Efforts in some of the cities have been successful, while efforts in others yielded questionable results.\n\n\n\nOne example of a success story is Mackinac, a bike-only island in Michigan, USA. 
The Michigan state&#8217;s website mentioned that the island boasts over 70 miles of cycling trails, 170 bikes per annual resident, a population of under 500 islanders and yet 85,000 annual bike licenses. \n\n\n\nWhy the gigantic number of licenses for such a tiny population? Because most of the licenses go to the ferried visitors the island attracts, and since motor vehicles are banned here, visitors either bring their own bikes or rent one.\n\n\n\nShifting a little more southwest to Colorado, there is a proposal to build a Dutch-style green bike paradise called Cyclocroft. \n\n\n\nAccording to insights from InsideHook, cute Mr Money Mustache blogger Pete Adeney and Dutch design company B4place joined forces to design Cyclocroft, which is roughly the size of 484 football fields to fit 50,000 people. While the concept seems praiseworthy, it has attracted criticism for being isolated rather than integrated into mainstream society, which is what governments should be striving for.\n\n\n\nEven if a government did aim to have something like this, things don&#8217;t necessarily turn out the way they want, as is the case for Pune&#8217;s multiple failures to revive its &#8216;City of Cycles&#8217; status in India most probably due to a lack of alignment and clear priorities as well as behavioural challenges. But Pune should still not be deterred by its failures as the benefits of complete bike networks may outweigh the risks.\n\n\n\nData collection and analysis to determine and improve bike-friendliness\n\n\n\nBesides facilitating cycling navigation, data can be collected and analysed to identify issues with cycling infrastructure and improve its design. As the Copenhagenize Index is &#8220;the most comprehensive and holistic ranking of bicycle-friendly cities on planet earth&#8221;, how is this ever-evolving index measured? According to its website:\n\n\n\nData of 600 cities worldwide are first gathered and stored in its database. 
The cities with a bicycle usage percentage above 2% go to the next round of analysis. The cities are then assigned between 0 and 4 points across the streetscape, culture and ambition parameters, and bonus points for exceptional performance. Below are the parameters crucial for indicating bike-friendly success:\n\n\n\nThe streetscape parameters: bicycle infrastructure, bicycle facilities, and traffic calming. The culture parameters: gender split, modal share for bicycles, modal share increase over the last 10 years, indicators of safety, image of the bicycle, and cargo bikes. The ambition parameters: advocacy, politics, bike share and urban planning.\n\n\n\nOnce the level of bike-friendliness is determined, the aspects of infrastructure that need improvement can then be identified. The European Cyclists&#8217; Federation (ECF) explained that the location-based satellite data collected from mobile cycling apps can be used to implement cycling policies and infrastructure based on parameters like preferred routes, number of cyclists, speed, delays at intersections, places of high demand, bike path issues and feedback.\n\n\n\nSadly, I&#8217;ve not spotted any mention of how the parameters are processed by Copenhagenize Index and ECF. But perhaps, the 2019 paper by Hong, McArthur and Livingston could have the answer to Evaluating Large Cycling Infrastructure Investments In Glasgow Using Crowdsourced Cycle Data. \n\n\n\nThe researchers collected 2013-2016 cycling data from Strava, Glasgow cycling infrastructure data and manual cordon counts of cyclists in Glasgow in 2014. They ran the data through a fixed effects Poisson panel data regression model to evaluate whether the cycling infrastructure (built partly for the 2014 Commonwealth Games) encouraged more people to cycle on these routes. 
This model was chosen for its ability to consider &#8220;unobserved heterogeneity&#8221;.\n\n\n\nImagining velotopias or something along those lanes lines\n\n\n\nThere&#8217;s so much to learn from bike-friendly cities. Successful bike-friendly cities set good examples for others to follow while unsuccessful cities&#8217; mistakes can serve as lessons for what can be done better. However, as the Copenhagenize Index website mentioned, &#8220;bicycle friendliness can come in many shapes and forms, with each new step offering critical utility to the urban citizens that need it.&#8221; \n\n\n\nThis means that anyone attempting to make a city more bike-friendly needs to consider what is possible for the city&#8217;s needs and not just copy and paste what other cities do, whether it&#8217;s coming up with outstanding ideas for a bike-oriented city (a.k.a. velotopia) in Sydney or placing a line of potted plants as a safety-guaranteed barrier instead of just painting roads.\n\n\n\nBut let&#8217;s keep in mind that one thing holds true universally and that is what car drivers can do to drive in harmony with cyclists on the road. Here are some tips from crowdsourced navigation app Waze:\n\n\n\nCheck for anyone outside before you open your door to avoid hitting someone, particularly a cyclist. Keep a distance from cyclists. Double-check your blind spot before turning to avoid hitting anyone. Strictly leave the cycling lane for cyclists. 
Be patient.\n\n\n\n\u201cEveryone has peace on the road when everyone has a piece of the road.\u201d Sara Studdard, director of local innovation at People For Bikes (quoted by Waze).\n\n\n\nRelated Posts\n\n\n\n\nThe NGO With The Bicycle Referendum &#038; Its Big Data Relevance\n\n\n\n\n\n6 Sustainability Efforts That Can Leverage Mobile Data Analytics\n\n\n\n\n\n10 Big Data-Driven Sustainability Use Cases You Should Know\n\n\n\n\n\nThe European Green Deal Is A Big Deal For Big Data\n\n\n\n\n\nMultimodal Transport Routes And Why Many Of Them Suck\n\n\n\n\n\nThe Ultimate Guide To The Demand For Public Transport Routes\n\n\n\n\n\nhttps://www.iunera.com/kraken/open-big-data-science-academy/iot/", "datePublished": "2021-06-21T10:48:11+01:00", "dateModified": "2022-02-25T15:07:06+01:00", "url": "https://www.iunera.com/kraken/sustainability/bike-friendly-cities/", "author": "Dhanhyaa", "articleSection": "Sustainability", "keywords": "bicycle, bicycle-friendly, big data, big data-driven, bike, bike-friendly, cities, city, economic sustainability, environmental sustainability, public transport, publicTransport, social sustainability, sustainability, sustainable, sustainable development, sustainable development goals, sustainable mobility"}}], "query_id": ""}

data: {"message_type": "result_batch", "results": [{"url": "https://www.iunera.com/kraken/projects/itb-digital-touchpoints-demo-with-flinkster-and-talk/", "name": "@ITB &#8211; Digital touchpoints demo with Flinkster and talk", "site": "iunera", "siteUrl": "iunera", "score": 65, "description": "This article discusses a demo involving digital touchpoints with real-time information connected to indoor iBeacons and outdoor geolocations, featuring car-sharing services like Flinkster and Call a Bike by Deutsche Bahn. It is relevant as it covers technology and projects related to location-based services and public transport data, which can be pertinent for understanding context-aware applications and geotargeting.", "schema_object": {"@context": "https://schema.org", "@type": "Article", "headline": "@ITB &#8211; Digital touchpoints demo with Flinkster and talk", "description": "We were really thrilled to be selected for a talk at the eTravelWorld.Thereby, we showed how real-time information can be connected to indoor iBeacons and outdoor geo positions.In addition to the pure theory, we were lucky to spice things up with a demo together with Flinkster car-sharing and Call a Bike of Deutsche Bahn. Listeners...", "articleBody": "We were really thrilled to be selected for a talk at the eTravelWorld.Thereby, we showed how real-time information can be connected to indoor iBeacons and outdoor geo positions.In addition to the pure theory, we were lucky to spice things up with a demo together with Flinkster car-sharing and Call a Bike of Deutsche Bahn. 
Listeners of the talk were able to see the real-time data of Deutsche Bahn in action on an iBeacon, indicating when and where you can rent the car, nearest to you.\nhttp://www.slideshare.net/TimFrey2/converting-travelers-into-customers-with-digital-touchpoints\n \n\n\nUpdate (29th April 2021):\u00a0Check out our latest project, Fahrbar!\n\n\n\n\nFahrbar: Just What We Need For Public Transport Crowding\n\n\n\n\n\t\t\t\n\t\t\t\tGet in touch with us\n\t\t\t\n\t\t\t\n\t\t\t\tIf you are interested in Fahrbar or want to find out how we can help you leverage your data\n\t\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\tContact us", "datePublished": "2017-03-11T10:49:00+01:00", "dateModified": "2021-09-23T06:39:42+01:00", "url": "https://www.iunera.com/kraken/projects/itb-digital-touchpoints-demo-with-flinkster-and-talk/", "author": "Tim", "image": "https://www.iunera.com/wp-content/uploads/2020/01/2017-03-05-17.16.37.png", "articleSection": "Our Projects", "keywords": "android, app, conference, contextaware, geotargeting, ibeacons, locationbased, talk"}}], "query_id": ""}

data: {"message_type": "result_batch", "results": [{"url": "https://www.iunera.com/kraken/jobs/angular-engineer-for-data-science-backend/", "name": "Angular Engineer for Data Science backend", "site": "iunera", "siteUrl": "iunera", "score": 60, "description": "This article describes a job opportunity for an Angular Engineer working on a Data Science backend, highlighting skills in Angular Material and Big Data technologies. It is relevant due to its focus on software development and data science, though the lack of a specific question limits direct applicability.", "schema_object": {"@context": "https://schema.org", "@type": "Article", "headline": "Angular Engineer for Data Science backend", "description": "Advance your Data Science career by advancing a progressive web app in Angular Material to interface even better with a Big Data and Data Science backend. With each increment you develop, you advance your skills within the Big Data Science area. About iunera Ideas and people resonate when they are communicated and get executed together....", "articleBody": "Advance your Data Science career by advancing  a progressive web app in Angular Material to interface even better with a Big Data and Data Science backend.\n\n\n\nWith each increment you develop, you advance your skills within the Big Data Science area.\n\n\n\nAbout iunera\n\n\n\nIdeas and people resonate when they are communicated and get executed together.\n\n\n\nResonance is relevant.\n\n\n\nRelevance is progress.\n\n\n\nProgress shapes the world!\n\n\n\nTherefore, iunera was started with the idea that technological process is achieved by hands-on and by an open culture.\n\n\n\nTogether, we build and leverage existing Big Data Tools together to support customers as partners on their journey to gain more value out of their data.\n\n\n\nWe believe in empowerment and growth to achieve the best service for our customers. 
Thus, it matters most who you are, what you can do, and what we can achieve together.\n\n\n\nFor this reason, iunera appreciates applications that contain insights about personal experiences and hands-on.\n\n\n\nHow we work\n\n\n\nAgileVirtual stand up meetingsTrusting and delivering on promisesIncremental improvement or processes and deliverablesStartup attitudeInternational team players\n\n\n\nYour skills\n\n\n\nExperienced in Angular, Angular Material and JavascriptExperienced in writing maintainable codeKnowledge of design patternsAdditional programming languages are a plusResult-driven working attitude and desire to finish tasks.Experienced in the usage of programming tools (e.g. git)High quality of spoken and written English.Autonomous working attitude.\n\n\n\nTasks and responsibilities\n\n\n\nParticipate in regular scrum meetingsContribute with own architectural ideas in development processesUse cutting edge Big Data and open source technology\n\n\n\nHow we meet\n\n\n\nPlease be aware that a video interview will be scheduled. Therefore, please do only apply if your notebook is equipped with the necessary hardware\n\n\n\nYour compensation\n\n\n\nFor this position, there are different working modes available:\n\n\n\nFreelancePart-timeMonthly-based compensation\n\n\n\nYour application\n\n\n\nYour application shall contain different documents.\n\n\n\nSeparate from your application in the email or an extra document: Compensation expectationsReference to some projects that you have developed in the past. In case no public project is available, please attach a description of what the project was about.Your personal details (education, experience, age\u2026)Optional: Brief motivational letter (max. 10 lines)\n\n\n\nApplication email: hrcareers (at.) 
iunera.com", "datePublished": "2020-11-13T06:36:35+01:00", "dateModified": "2021-11-04T05:58:14+01:00", "url": "https://www.iunera.com/kraken/jobs/angular-engineer-for-data-science-backend/", "author": "Tim"}}], "query_id": ""}

data: {"message_type": "result_batch", "results": [{"url": "https://www.iunera.com/kraken/nlweb/nlweb-deployment-in-kubernetes-gitops-style-with-fluxcd/", "name": "NLWeb Deployment in Kubernetes GitOps Style with FluxCD", "site": "iunera", "siteUrl": "iunera", "score": 60, "description": "This article provides an in-depth overview of deploying NLWeb, an AI-driven web platform, using Kubernetes and GitOps with FluxCD. It is relevant as it covers advanced deployment strategies, integration with multiple AI providers, and production-ready configurations. The relevance is limited by the absence of a specific question to target.", "schema_object": {"@context": "https://schema.org", "@type": "Article", "headline": "NLWeb Deployment in Kubernetes GitOps Style with FluxCD", "description": "NLWeb is revolutionizing web development by integrating advanced machine learning and AI, as showcased in recent insights. Picture NLWeb as a powerful kraken, its tentacles infused with the Kubernetes (K8s) logo, symbolizing the robust infrastructure behind these AI-driven solutions. NLWeb delivers dynamic, intelligent websites that adapt and evolve, offering unparalleled user experiences.\n\n", "articleBody": "The landscape of AI-powered web applications is evolving rapidly, and at the forefront of this revolution stands NLWeb \u2014 Microsoft&#8217;s groundbreaking open-source protocol that transforms traditional websites into intelligent, AI-driven knowledge hubs. When combined with Kubernetes container orchestration platform and GitOps methodologies, NLWeb creates a production-ready ecosystem that&#8217;s both scalable and maintainable. 
This comprehensive guide explores how to deploy NLWeb using modern DevOps practices, leveraging the power of FluxCD for continuous deployment and Azure&#8217;s robust cloud infrastructure for Kubernetes.\n\n\n\nDiscover the power of NLWeb when using K8s as operations kraken.\n\n\n\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\n\t\t\n\t\t\t\n\t\t\t\tWhat Makes NLWeb Revolutionary in the AI Web Space?Understanding the GitOps Advantage for NLWeb DeploymentsTechnical Architecture: NLWeb on KubernetesFluxCD Integration: Continuous Deployment Made SimpleAzure Integration: Cloud-Native AI InfrastructureProduction-Ready Features and Best PracticesDeployment Comparison: NLWeb vs Traditional ApproachesAdvanced Configuration ExamplesSecurity Considerations and Best PracticesConclusionFrequently Asked Questions\n\t\t\t\n\t\t\n\n\nWhat Makes NLWeb Revolutionary in the AI Web Space?\n\n\n\nNLWeb represents a paradigm shift in how we think about web applications. Unlike traditional static websites or even dynamic web applications, NLWeb enables AI-powered websites that can understand, process, and respond to user queries with unprecedented intelligence. The platform seamlessly integrates with vector databases, multiple LLM providers, and enterprise data sources to create truly interactive web experiences.\n\n\n\nThe protocol&#8217;s architecture is designed with modern cloud-native principles and CNCF best practices in mind. It supports multiple embedding providers including OpenAI, Azure OpenAI, Gemini, and Snowflake, while offering flexible LLM integration with providers ranging from Anthropic&#8217;s Claude AI assistant to Hugging Face models. 
This multi-provider approach ensures resilience and allows organizations to optimize costs while maintaining performance.\n\n\n\nKey Points:\n\n\n\n\nIntelligent Interactions: Enables natural language understanding and contextual responses\n\n\n\nMulti-Provider Support: Integrates with various AI providers for flexibility and redundancy\n\n\n\nNot yet Enterprise-Ready: Designed for production deployments with easy use in mind, it&#8217;s currently at an early stage. We try to improve this by submitting bug fixes and enhancements.\n\n\n\n\nUnderstanding the GitOps Advantage for NLWeb Deployments\n\n\n\nGitOps declarative infrastructure management methodology has emerged as the gold standard for Kubernetes deployments, and NLWeb&#8217;s architecture perfectly aligns with this approach. By treating Git repositories as the single source of truth for infrastructure and application configurations, teams can achieve unprecedented levels of automation, auditability, and reliability.\n\n\n\nThe iunera helm charts repository provides production-ready Helm charts specifically designed for NLWeb deployments. These charts encapsulate years of operational experience and best practices, making it straightforward to deploy NLWeb in any Kubernetes environment while maintaining consistency across development, staging, and production environments. If you are interested in a general-purpose Helm chart for basically any kind of simple deployment, the Spring Boot chart is worth a look.\n\n\n\nFluxCD serves as the GitOps operator, continuously monitoring the Git repository for changes and automatically applying them to the Kubernetes cluster. 
This approach eliminates configuration drift, reduces manual intervention, and provides a complete audit trail of all changes made to the system.\n\n\n\nGitOps Benefits for NLWeb:\n\n\n\n\nDeclarative Infrastructure: Everything defined as code in Git repositories\n\n\n\nAutomated Deployments: Changes automatically applied when committed to Git\n\n\n\nVersion Control: Complete history of all configuration changes\n\n\n\nRollback Capability: Easy reversion to previous known-good states\n\n\n\nConsistency: Same deployment process across all environments\n\n\n\n\nNow, let&#8217;s explore the technical architecture of NLWeb on Kubernetes to understand how these components work together.\n\n\n\nA different use case we&#8217;ve implemented is the production-grade deployment of Apache Druid using Druid Operators and FluxCD.  \n\n\n\nTechnical Architecture: NLWeb on Kubernetes\n\n\n\nCore Components and Configuration\n\n\n\nNLWeb&#8217;s Kubernetes deployment consists of several key components that work together to deliver AI-powered web experiences:\n\n\n\nApplication Layer: The core NLWeb application runs as a Python-based service, typically deployed using the iunera/nlweb Docker image. 
The application serves on port 8000 and includes comprehensive health checks for both liveness and readiness probes.\n\n\n\nConfiguration Management: NLWeb uses a sophisticated configuration system with multiple YAML files:\n\n\n\n\nconfig_webserver.yaml: Handles server settings, CORS policies, SSL configuration, and static file serving\n\n\n\nconfig_llm.yaml: Manages LLM provider configurations and model selections\n\n\n\nconfig_embedding.yaml: Controls embedding provider settings and model preferences\n\n\n\nconfig_llm_performance.yaml: Optimizes performance through caching and response management\n\n\n\n\nSecurity Context: The deployment implements Kubernetes pod security standards and best practices including:\n\n\n\n\nNon-root user execution (UID 999)\n\n\n\nRead-only root filesystem\n\n\n\nDropped capabilities\n\n\n\nSecurity contexts for both pod and container levels\n\n\n\n\nThis architecture provides a secure, scalable foundation for deploying NLWeb in production environments.\n\n\n\nHelm Chart Structure and Values\n\n\n\nThe NLWeb Helm chart provides extensive customization options through its values.yaml configuration:\n\n\n\nreplicaCount: 1\nimage:\n  repository: iunera/nlweb\n  pullPolicy: IfNotPresent\n\nservice:\n  type: ClusterIP\n  port: 8000\n\nenv:\n  - name: PYTHONPATH\n    value: \"/app\"\n  - name: PORT\n    value: \"8000\"\n  - name: NLWEB_LOGGING_PROFILE\n    value: production\n\n\n\n\nThe chart supports advanced features including:\n\n\n\n\nAutoscaling: Horizontal Pod Autoscaler configuration with CPU-based scaling\n\n\n\nIngress: NGINX ingress controller integration with SSL/TLS termination\n\n\n\nVolumes: Persistent volume claims, ConfigMaps, and EmptyDir volumes\n\n\n\nConfigmaps: Configure the NLWeb Configs like LLM, Vector Endpoint, etc from it\n\n\n\nSecurity: Pod security contexts and network policies\n\n\n\n\nFluxCD Integration: Continuous Deployment Made Simple\n\n\n\nFluxCD continuous delivery for Kubernetes is a critical 
component in the GitOps deployment strategy for NLWeb, providing automated continuous delivery capabilities. It connects your Git repository to your Kubernetes cluster, ensuring that any changes to your deployment manifests are automatically applied.\n\n\n\nHelmRelease Controller\n\n\n\nThe GitOps deployment of NLWeb leverages FluxCD&#8217;s HelmRelease custom resource to manage the application lifecycle. Here&#8217;s how the integration works:\n\n\n\napiVersion: helm.toolkit.fluxcd.io/v2beta1\nkind: HelmRelease\nmetadata:\n  name: nlweb\n  namespace: nlweb\nspec:\n  releaseName: nlweb\n  targetNamespace: nlweb\n  chart:\n    spec:\n      chart: nlweb\n      version: \">=1.1.0\"\n      sourceRef:\n        kind: HelmRepository\n        name: iunera-helm-charts\n        namespace: helmrepos\n  interval: 1m0s\n\n\n\nThis configuration ensures that FluxCD continuously monitors the Helm repository for updates and automatically applies them to the cluster. The interval: 1m0s setting means FluxCD checks for changes every minute, providing near real-time deployment capabilities.\n\n\n\nImage Automation and Version Management\n\n\n\nFluxCD&#8217;s image automation capabilities work seamlessly with NLWeb deployments. The system can automatically detect new container image versions and update the deployment manifests accordingly. This is particularly valuable for maintaining up-to-date deployments while ensuring proper testing and validation workflows.\n\n\n\nImage Policy Configuration\n\n\n\nNLWeb deployments leverage FluxCD&#8217;s image automation controllers to automatically update container images when new versions are published. 
This is configured through special annotations in the HelmRelease manifest:\n\n\n\nimage:\n  repository: iunera/nlweb # {\"$imagepolicy\": \"flux-system:nlweb:name\"}\n  tag: 1.2.4 # {\"$imagepolicy\": \"flux-system:nlweb:tag\"}\n\n\n\nThese annotations tell FluxCD to automatically update the image repository and tag values based on the image policy defined in the nlweb.imagerepo.yaml file. When a new image version is detected that matches the policy criteria, FluxCD automatically updates the manifest and commits the changes to the Git repository.\n\n\n\nImage Repository and Policy Configuration\n\n\n\nThe image automation is configured through two key resources defined in the nlweb.imagerepo.yaml file:\n\n\n\n# ImageRepository defines the Docker image repository to monitor\napiVersion: image.toolkit.fluxcd.io/v1beta2\nkind: ImageRepository\nmetadata:\n  name: nlweb\n  namespace: flux-system\nspec:\n  image: iunera/nlweb\n  interval: 10m\n  secretRef:\n    name: iunera\n\n---\n# ImagePolicy defines which image versions to select\napiVersion: image.toolkit.fluxcd.io/v1beta2\nkind: ImagePolicy\nmetadata:\n  name: nlweb\n  namespace: flux-system\nspec:\n  imageRepositoryRef:\n    name: nlweb\n  policy:\n    semver:\n      range: \">=1.0.0\"\n\n\n\nThe ImageRepository resource specifies:\n\n\n\n\nThe Docker image to monitor (iunera/nlweb)\n\n\n\nHow often to check for new versions (interval: 10m)\n\n\n\nAuthentication credentials for the Docker registry (secretRef: name: iunera)\n\n\n\n\nThe ImagePolicy resource defines the selection criteria for image versions using semantic versioning, in this case selecting any version greater than or equal to 1.0.0.\n\n\n\nAutomation Workflow\n\n\n\nThe complete automation workflow is managed by the ImageUpdateAutomation resource:\n\n\n\napiVersion: image.toolkit.fluxcd.io/v1beta2\nkind: ImageUpdateAutomation\nmetadata:\n  name: flux-system\n  namespace: flux-system\nspec:\n  git:\n    checkout:\n      ref:\n        branch: 
master\n    commit:\n      author:\n        email: fluxcdbot@nodomain.local\n        name: fluxcdbot\n      messageTemplate: |\n        Automated image update\n\n        Automation name: {{ .AutomationObject }}\n\n        Files:\n        {{ range $filename, $_ := .Changed.FileChanges -}}\n        - {{ $filename }}\n        {{ end -}}\n\n        Objects:\n        {{ range $resource, $changes := .Changed.Objects -}}\n        - {{ $resource.Kind }} {{ $resource.Name }}\n          Changes:\n        {{- range $_, $change := $changes }}\n            - {{ $change.OldValue }} -> {{ $change.NewValue }}\n        {{ end -}}\n        {{ end -}}\n    push:\n      branch: master\n  interval: 30m0s\n  sourceRef:\n    kind: GitRepository\n    name: flux-system\n  update:\n    path: ./kubernetes/common\n    strategy: Setters\n\n\n\nThis resource:\n\n\n\n\nChecks out the Git repository&#8217;s master branch\n\n\n\nConfigures commit details with a template that includes what was changed\n\n\n\nPushes changes back to the master branch\n\n\n\nRuns every 30 minutes\n\n\n\nUpdates files in the ./kubernetes/common path using the &#8220;Setters&#8221; strategy (looking for image policy annotations)\n\n\n\n\nWith this configuration, the NLWeb deployment automatically stays up-to-date with the latest compatible container images without manual intervention, while maintaining a complete audit trail of all changes through Git history.\n\n\n\nDocker Build and CI/CD Pipeline\n\n\n\nThe NLWeb Docker image build and deployment process follows a comprehensive CI/CD pipeline that integrates with the FluxCD GitOps workflow:\n\n\n\nDockerfile Structure and Multi-Stage Build\n\n\n\nThe NLWeb Dockerfile uses a Docker multi-stage build process for optimized container images to create an efficient and secure deployment package:\n\n\n\n# Stage 1: Build stage\nFROM python:3.13-slim AS builder\n\n# Install build dependencies\nRUN apt-get update &amp;&amp; \\\n    apt-get install -y --no-install-recommends gcc 
python3-dev &amp;&amp; \\\n    pip install --no-cache-dir --upgrade pip &amp;&amp; \\\n    apt-get clean &amp;&amp; \\\n    rm -rf /var/lib/apt/lists/*\n\nWORKDIR /app\n\n# Copy requirements file\nCOPY code/requirements.txt .\n\n# Install Python packages\nRUN pip install --no-cache-dir -r requirements.txt\n\n# Copy requirements file\nCOPY docker_requirements.txt .\n\n# Install Python packages\nRUN pip install --no-cache-dir -r docker_requirements.txt\n\n# Stage 2: Runtime stage\nFROM python:3.13-slim\n\n# Apply security updates\nRUN apt-get update &amp;&amp; \\\n   apt-get install -y --no-install-recommends --only-upgrade \\\n       $(apt-get --just-print upgrade | grep \"^Inst\" | grep -i securi | awk '{print $2}') &amp;&amp; \\\n   apt-get clean &amp;&amp; \\\n   rm -rf /var/lib/apt/lists/*\n\nWORKDIR /app\n\n# Create a non-root user and set permissions\nRUN groupadd -r nlweb &amp;&amp; \\\n    useradd -r -g nlweb -d /app -s /bin/bash nlweb &amp;&amp; \\\n    chown -R nlweb:nlweb /app\n\nUSER nlweb\n\n# Copy application code\nCOPY code/ /app/\nCOPY static/ /app/static/\n\n# Copy installed packages from builder stage\nCOPY --from=builder /usr/local/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages\nCOPY --from=builder /usr/local/bin /usr/local/bin\n\n# Expose the port the app runs on\nEXPOSE 8000\n\n# Set environment variables\nENV NLWEB_OUTPUT_DIR=/app\nENV PYTHONPATH=/app\nENV PORT=8000\n\nENV VERSION=1.2.4\n\n# Command to run the application\nCMD [\"python\", \"app-file.py\"]\n\n\n\nKey aspects of the Dockerfile:\n\n\n\n\nStage 1 (Builder): Installs all dependencies and build tools\n\n\n\nStage 2 (Runtime): Creates a minimal runtime environment\n\n\n\nSecurity Features: Non-root user, security updates, minimal dependencies\n\n\n\nVersion Definition: ENV VERSION=1.2.4 defines the version that will be used for tagging\n\n\n\n\nGitHub Actions Workflow\n\n\n\nWhen changes are pushed to the iuneracustomizations branch and the Dockerfile is 
modified, the GitHub Actions CI/CD automation workflow in .github/workflows/prod-build.yml is triggered:\n\n\n\nname: prod-build\n\non:\n  push:\n    branches:\n      - iuneracustomizations\n    paths:\n      - Dockerfile\n\njobs:\n  build-and-push:\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout Repository\n        uses: actions/checkout@v4\n\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@v3\n\n      - name: Log in to Private Registry\n        uses: docker/login-action@v3\n        with:\n          username: ${{ secrets.DOCKERHUB_USERNAME }}\n          password: ${{ secrets.DOCKERHUB_TOKEN }}\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v3\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@v3\n\n      - name: Extract Version from Dockerfile\n        id: extract_version\n        run: |\n          # Extract the VERSION from Dockerfile\n          VERSION=$(grep \"ENV VERSION=\" Dockerfile | cut -d= -f2)\n          echo \"VERSION=${VERSION}\" >> $GITHUB_ENV\n          echo \"Using version from Dockerfile: ${VERSION}\"\n\n      - name: Build the Docker image\n        run: |\n          docker build -t iunera/nlweb:latest -t iunera/nlweb:${{ env.VERSION }} .\n          docker push iunera/nlweb:latest\n          docker push iunera/nlweb:${{ env.VERSION }}\n          echo \"Built and pushed Docker image with tags: latest, ${{ env.VERSION }}\"\n\n      - name: Inspect\n        run: |\n          docker image inspect iunera/nlweb:latest\n\n      - name: Create and Push Git Tag\n        run: |\n          git config --global user.name \"GitHub Actions\"\n          git config --global user.email \"actions@github.com\"\n          git tag -a v${{ env.VERSION }} -m \"Release version ${{ env.VERSION }}\"\n          git push origin v${{ env.VERSION }}\n\n\n\nThe workflow performs these steps:\n\n\n\n\nCheckout Repository: Clones the repository to the GitHub Actions runner\n\n\n\nSet up 
Docker Buildx: Configures Docker with multi-architecture build support\n\n\n\nLog in to Docker Hub: Authenticates with Docker Hub using repository secrets\n\n\n\nSet up QEMU: Enables building for multiple architectures (ARM64, AMD64)\n\n\n\nExtract Version: Parses the Dockerfile to extract the VERSION environment variable\n\n\n\nBuild and Push: Builds the Docker image with two tags (latest and the version number) and pushes both to Docker Hub\n\n\n\nInspect: Displays information about the built image for verification\n\n\n\nCreate Git Tag: Creates a Git tag for the version and pushes it to the repository\n\n\n\n\nComplete CI/CD to Deployment Flow\n\n\n\nThe complete flow from Dockerfile to deployment involves:\n\n\n\n\nDevelopment: A developer updates the Dockerfile, potentially changing the VERSION\n\n\n\nCI/CD: GitHub Actions builds and pushes the Docker image to Docker Hub\n\n\n\nAutomation: FluxCD detects the new image version in Docker Hub\n\n\n\nGitOps: FluxCD updates the Kubernetes manifests with the new image version and commits the changes back to the Git repository\n\n\n\nDeployment: FluxCD applies the changes to the Kubernetes cluster, creating new pods with the updated image\n\n\n\n\nThis GitOps approach ensures that:\n\n\n\n\nThe Git repository is the single source of truth\n\n\n\nAll changes are tracked and auditable\n\n\n\nDeployments are automated and consistent\n\n\n\nRollbacks are simple and reliable\n\n\n\n\nLocal Development Environment\n\n\n\nWhile the GitHub Actions workflow handles production builds, local development uses Docker Compose:\n\n\n\nservices:\n  nlweb:\n    build:\n      context: .\n      dockerfile: Dockerfile\n    container_name: nlweb\n    ports:\n      - \"8000:8000\"\n    env_file:\n      - ./code/.env\n    environment:\n      - PYTHONPATH=/app\n      - PORT=8000\n    volumes:\n      - ./data:/data\n      - ./code/config:/app/config:ro\n    healthcheck:\n      test: [\"CMD-SHELL\", \"python -c \\\"import urllib.request; 
urllib.request.urlopen('http://localhost:8000')\\\"\"]\n      interval: 30s\n      timeout: 10s\n      retries: 3\n      start_period: 10s\n    restart: unless-stopped\n    user: nlweb\n\n\n\nThis setup:\n\n\n\n\nUses the same Dockerfile as production\n\n\n\nMounts local directories for data and configuration\n\n\n\nLoads environment variables from a local .env file\n\n\n\nIncludes healthchecks for monitoring\n\n\n\nRuns as the non-root nlweb user\n\n\n\n\nThe combination of GitHub Actions for CI/CD and FluxCD for GitOps creates a robust and automated pipeline for building and deploying NLWeb, ensuring consistency between development and production environments.\n\n\n\nAzure Integration: Cloud-Native AI Infrastructure\n\n\n\nNLWeb&#8217;s integration with Azure services makes it an ideal choice for organizations already invested in Microsoft&#8217;s cloud ecosystem. The platform natively supports:\n\n\n\nAzure Cognitive Search: For vector search capabilities, NLWeb integrates with Azure&#8217;s vector search service, providing scalable and performant similarity search across large datasets.\n\n\n\nAzure OpenAI Service: Direct integration with Azure&#8217;s OpenAI offerings, including GPT-4 and embedding models, ensures enterprise-grade AI capabilities with proper governance and compliance.\n\n\n\nAzure Container Registry: Seamless integration with ACR for container image management and security scanning.\n\n\n\nThe configuration for Azure services is handled through environment variables and ConfigMaps, making it easy to manage different environments and maintain security best practices:\n\n\n\nenv:\n  - name: AZURE_VECTOR_SEARCH_ENDPOINT\n    value: \"https://your-vector-search-db.search.windows.net\"\n  - name: AZURE_OPENAI_ENDPOINT\n    value: \"https://your-openai-instance.openai.azure.com/\"\n\n\n\nProduction-Ready Features and Best Practices\n\n\n\nMulti-Provider LLM Support\n\n\n\nOne of NLWeb&#8217;s standout features is its support for multiple LLM 
providers, ensuring vendor independence and cost optimization. The platform supports:\n\n\n\n\nOpenAI: GPT-4.1 and GPT-4.1-mini models\n\n\n\nAnthropic: Claude-3-7-sonnet-latest and Claude-3-5-haiku-latest\n\n\n\nAzure OpenAI: Enterprise-grade OpenAI models with Azure's security and compliance\n\n\n\nGoogle Gemini: chat-bison models for diverse AI capabilities\n\n\n\nSnowflake: Arctic embedding models and Claude integration\n\n\n\nHugging Face: Open-source models including Qwen2.5 series\n\n\n\n\nThis multi-provider approach allows organizations to:\n\n\n\n\nOptimize costs by using different models for different use cases\n\n\n\nEnsure service availability through provider redundancy\n\n\n\nExperiment with cutting-edge models without vendor lock-in\n\n\n\n\nPerformance Optimization and Caching\n\n\n\nIunera's customizations of NLWeb implement caching to improve performance and reduce API costs:\n\n\n\ncache:\n  enable: true\n  max_size: 1000\n  ttl: 0  # No expiration\n  include_schema: true\n  include_provider: true\n  include_model: true\n\n\n\nThe cache key includes the schema, provider, and model, so a cached response is reused only for a request that matches on all three.\n\n\n\nEnterprise Data Integration\n\n\n\nBuilding on the foundation laid out in the comprehensive guide to exposing enterprise data with Java and Spring for AI indexing, NLWeb provides seamless integration with enterprise data sources. 
The platform supports:\n\n\n\n\nJSON-LD and Schema.org: Structured data integration for semantic web capabilities\n\n\n\nVector Database Integration: Support for various vector databases including Azure Cognitive Search\n\n\n\nReal-time Data Processing: Stream processing capabilities for dynamic content updates\n\n\n\nEnterprise Security: Role-based access control and data governance features\n\n\n\n\nDeployment Comparison: NLWeb vs Traditional Approaches\n\n\n\nFeature | NLWeb GitOps | Azure Web Apps | Traditional Linux Install\nScalability | Auto-scaling with HPA | Limited vertical scaling | Manual scaling required\nDeployment Speed | Automated via GitOps | Manual deployment | Manual configuration\nConfiguration Management | Git-based versioning | Portal-based settings | File-based configuration\nMulti-environment Support | Native Kubernetes namespaces | Separate app instances | Separate servers\nRollback Capabilities | Git-based rollbacks | Limited rollback options | Manual rollback process\nCost Optimization | Resource-based pricing | App Service Plan pricing | Infrastructure costs\nMonitoring & Observability | Kubernetes-native tools | Azure Monitor integration | Custom monitoring setup\nSecurity | Pod security contexts | Azure security features | Manual security hardening\n\n\n\nThe Iunera Helm charts provide an advantage in this comparison, offering production-tested configurations that help avoid common deployment pitfalls.\n\n\n\nAdvanced Configuration Examples\n\n\n\nThis section provides practical, production-ready configuration examples for deploying NLWeb in various environments. 
These examples can be used as templates for your own deployments, with customization as needed for your specific requirements.\n\n\n\nNote: The following examples are organized by use case to help you find the most relevant configurations for your needs.\n\n\n\nComplete Helm Installation Manifest Examples\n\n\n\nBasic Development Setup\n\n\n\nFor development environments, here&#8217;s a minimal helm installation manifest:\n\n\n\napiVersion: helm.toolkit.fluxcd.io/v2beta1\nkind: HelmRelease\nmetadata:\n  name: nlweb-dev\n  namespace: nlweb-dev\nspec:\n  releaseName: nlweb-dev\n  targetNamespace: nlweb-dev\n  chart:\n    spec:\n      chart: nlweb\n      version: \">=1.1.0\"\n      sourceRef:\n        kind: HelmRepository\n        name: iunera-helm-charts\n        namespace: helmrepos\n  interval: 5m0s\n  install:\n    createNamespace: true\n  values:\n    replicaCount: 1\n    image:\n      repository: iunera/nlweb\n      tag: \"latest\"\n      pullPolicy: Always\n\n    env:\n      - name: NLWEB_LOGGING_PROFILE\n        value: development\n      - name: OPENAI_API_KEY\n        valueFrom:\n          secretKeyRef:\n            name: nlweb-secrets\n            key: openai-api-key\n\n    ingress:\n      enabled: true\n      annotations:\n        kubernetes.io/ingress.class: nginx\n      hosts:\n        - host: nlweb-dev.local\n          paths:\n            - path: /\n              pathType: ImplementationSpecific\n\n    resources:\n      requests:\n        cpu: 100m\n        memory: 512Mi\n      limits:\n        cpu: 500m\n        memory: 1Gi\n\n\n\nProduction-Ready Setup with Multi-Provider LLM Support\n\n\n\nFor production environments with comprehensive AI provider integration:\n\n\n\napiVersion: helm.toolkit.fluxcd.io/v2beta1\nkind: HelmRelease\nmetadata:\n  name: nlweb-prod\n  namespace: nlweb\nspec:\n  releaseName: nlweb\n  targetNamespace: nlweb\n  chart:\n    spec:\n      chart: nlweb\n      version: \">=1.1.0\"\n      sourceRef:\n        kind: HelmRepository\n    
    name: iunera-helm-charts\n        namespace: helmrepos\n  interval: 1m0s\n  install:\n    createNamespace: false\n  upgrade:\n    remediation:\n      retries: 3\n  values:\n    replicaCount: 3\n    image:\n      repository: iunera/nlweb\n      tag: \"1.2.4\"\n      pullPolicy: IfNotPresent\n\n    env:\n      - name: NLWEB_LOGGING_PROFILE\n        value: production\n      - name: AZURE_VECTOR_SEARCH_ENDPOINT\n        value: \"https://nlweb-prod.search.windows.net\"\n      - name: AZURE_VECTOR_SEARCH_API_KEY\n        valueFrom:\n          secretKeyRef:\n            name: nlweb-azure-secrets\n            key: vector-search-key\n      - name: OPENAI_API_KEY\n        valueFrom:\n          secretKeyRef:\n            name: nlweb-openai-secrets\n            key: api-key\n      - name: ANTHROPIC_API_KEY\n        valueFrom:\n          secretKeyRef:\n            name: nlweb-anthropic-secrets\n            key: api-key\n      - name: AZURE_OPENAI_API_KEY\n        valueFrom:\n          secretKeyRef:\n            name: nlweb-azure-openai-secrets\n            key: api-key\n\n    ingress:\n      enabled: true\n      annotations:\n        kubernetes.io/ingress.class: nginx\n        kubernetes.io/tls-acme: \"true\"\n        cert-manager.io/cluster-issuer: letsencrypt-prod\n        nginx.ingress.kubernetes.io/force-ssl-redirect: \"true\"\n        nginx.ingress.kubernetes.io/enable-modsecurity: \"true\"\n        nginx.ingress.kubernetes.io/enable-owasp-core-rules: \"true\"\n        nginx.ingress.kubernetes.io/rate-limit: \"100\"\n        nginx.ingress.kubernetes.io/rate-limit-window: \"1m\"\n      hosts:\n        - host: nlweb.example.com\n          paths:\n            - path: /\n              pathType: ImplementationSpecific\n      tls:\n        - secretName: nlweb-tls\n          hosts:\n            - nlweb.example.com\n\n    resources:\n      requests:\n        cpu: 200m\n        memory: 1Gi\n      limits:\n        cpu: 1000m\n        memory: 2Gi\n\n    autoscaling:\n      
enabled: true\n      minReplicas: 3\n      maxReplicas: 10\n      targetCPUUtilizationPercentage: 70\n      targetMemoryUtilizationPercentage: 80\n\n\n\nComprehensive ConfigMap Customization Examples\n\n\n\nWeb Server Configuration for Different Environments\n\n\n\nDevelopment Environment ConfigMap:\n\n\n\nvolumes:\n  configMaps:\n    - name: nlweb-dev-config\n      mountPath: /app/config\n      data:\n        config_webserver.yaml: |-\n          port: 8000\n          static_directory: ../../\n          mode: development\n\n          server:\n            host: 0.0.0.0\n            enable_cors: true\n            cors_trusted_origins: \"*\"  # Allow all origins in dev\n            max_connections: 50\n            timeout: 60\n\n            logging:\n              level: debug\n              file: ./logs/webserver.log\n              console: true\n\n            static:\n              enable_cache: false  # Disable caching in dev\n              gzip_enabled: false\n\n\n\nProduction Environment ConfigMap:\n\n\n\nvolumes:\n  configMaps:\n    - name: nlweb-prod-config\n      mountPath: /app/config\n      data:\n        config_webserver.yaml: |-\n          port: 8000\n          static_directory: ../../\n          mode: production\n\n          server:\n            host: 0.0.0.0\n            enable_cors: true\n            cors_trusted_origins:\n              - https://nlweb.example.com\n              - https://api.example.com\n              - https://admin.example.com\n            max_connections: 200\n            timeout: 30\n\n            ssl:\n              enabled: true\n              cert_file_env: SSL_CERT_FILE\n              key_file_env: SSL_KEY_FILE\n\n            logging:\n              level: info\n              file: ./logs/webserver.log\n              console: false\n              rotation:\n                max_size: 100MB\n                max_files: 10\n\n            static:\n              enable_cache: true\n              cache_max_age: 86400  # 24 hours\n     
         gzip_enabled: true\n              compression_level: 6\n\n\n\nMulti-Provider LLM Configuration\n\n\n\nEnterprise LLM Setup with Fallback Providers:\n\n\n\nvolumes:\n  configMaps:\n    - name: nlweb-llm-config\n      mountPath: /app/config\n      data:\n        config_llm.yaml: |-\n          preferred_endpoint: azure_openai\n          fallback_strategy: round_robin\n\n          endpoints:\n            azure_openai:\n              api_key_env: AZURE_OPENAI_API_KEY\n              api_endpoint_env: AZURE_OPENAI_ENDPOINT\n              api_version_env: \"2024-12-01-preview\"\n              llm_type: azure_openai\n              models:\n                high: gpt-4o\n                low: gpt-4o-mini\n              rate_limits:\n                requests_per_minute: 1000\n                tokens_per_minute: 150000\n              retry_config:\n                max_retries: 3\n                backoff_factor: 2\n\n            openai:\n              api_key_env: OPENAI_API_KEY\n              api_endpoint_env: OPENAI_ENDPOINT\n              llm_type: openai\n              models:\n                high: gpt-4-turbo\n                low: gpt-3.5-turbo\n              rate_limits:\n                requests_per_minute: 500\n                tokens_per_minute: 90000\n\n            anthropic:\n              api_key_env: ANTHROPIC_API_KEY\n              llm_type: anthropic\n              models:\n                high: claude-3-opus-20240229\n                low: claude-3-haiku-20240307\n              rate_limits:\n                requests_per_minute: 300\n                tokens_per_minute: 60000\n\n            gemini:\n              api_key_env: GCP_PROJECT\n              llm_type: gemini\n              models:\n                high: gemini-1.5-pro\n                low: gemini-1.5-flash\n              rate_limits:\n                requests_per_minute: 200\n                tokens_per_minute: 40000\n\n\n\nEmbedding Provider Configuration for Vector Search\n\n\n\nMulti-Provider 
Embedding Setup:\n\n\n\nvolumes:\n  configMaps:\n    - name: nlweb-embedding-config\n      mountPath: /app/config\n      data:\n        config_embedding.yaml: |-\n          preferred_provider: azure_openai\n          fallback_providers:\n            - openai\n            - snowflake\n\n          providers:\n            azure_openai:\n              api_key_env: AZURE_OPENAI_API_KEY\n              api_endpoint_env: AZURE_OPENAI_ENDPOINT\n              api_version_env: \"2024-10-21\"\n              model: text-embedding-3-large\n              dimensions: 3072\n              batch_size: 100\n              rate_limits:\n                requests_per_minute: 1000\n\n            openai:\n              api_key_env: OPENAI_API_KEY\n              api_endpoint_env: OPENAI_ENDPOINT\n              model: text-embedding-3-large\n              dimensions: 3072\n              batch_size: 100\n              rate_limits:\n                requests_per_minute: 500\n\n            snowflake:\n              api_key_env: SNOWFLAKE_PAT\n              api_endpoint_env: SNOWFLAKE_ACCOUNT_URL\n              api_version_env: \"2024-10-01\"\n              model: snowflake-arctic-embed-l\n              dimensions: 1024\n              batch_size: 50\n              rate_limits:\n                requests_per_minute: 200\n\n            huggingface:\n              api_key_env: HF_TOKEN\n              model: sentence-transformers/all-mpnet-base-v2\n              dimensions: 768\n              local_inference: true\n              device: cpu\n\n\n\nPerformance Optimization Configuration\n\n\n\nHigh-Performance Caching Setup:\n\n\n\nvolumes:\n  configMaps:\n    - name: nlweb-performance-config\n      mountPath: /app/config\n      data:\n        config_llm_performance.yaml: |-\n          # LLM Performance Settings\n          representation:\n            use_compact: true\n            limit: 10\n            include_metadata: true\n\n          cache:\n            enable: true\n            max_size: 10000\n  
          ttl: 3600  # 1 hour\n            include_schema: true\n            include_provider: true\n            include_model: true\n            include_user_context: false\n            compression: gzip\n\n          rate_limiting:\n            enable: true\n            requests_per_minute: 1000\n            burst_size: 100\n            per_user_limit: 50\n\n          monitoring:\n            enable_metrics: true\n            metrics_port: 9090\n            health_check_interval: 30\n            performance_logging: true\n\n\n\nEnvironment-Specific Volume Configurations\n\n\n\nDevelopment with Hot Reloading:\n\n\n\nvolumes:\n  enabled: true\n  emptyDirs:\n    - name: data\n      mountPath: /app/data\n    - name: logs\n      mountPath: /app/logs\n    - name: tmp\n      mountPath: /tmp\n    - name: cache\n      mountPath: /app/cache\n\n  # Development: Use hostPath for easy file access\n  hostPaths:\n    - name: dev-config\n      hostPath: /local/dev/nlweb/config\n      mountPath: /app/config\n      type: DirectoryOrCreate\n\n\n\nProduction with Persistent Storage:\n\n\n\nvolumes:\n  enabled: true\n  emptyDirs:\n    - name: tmp\n      mountPath: /tmp\n      sizeLimit: 1Gi\n\n  pvc:\n    enabled: true\n    storageClass: fast-ssd\n    size: 50Gi\n    accessMode: ReadWriteOnce\n    mountPath: /app/data\n\n  # Production: Use ConfigMaps for configuration\n  configMaps:\n    - name: nlweb-prod-config\n      mountPath: /app/config\n    - name: nlweb-llm-config\n      mountPath: /app/config/llm\n    - name: nlweb-embedding-config\n      mountPath: /app/config/embedding\n\n  # Production: Use Secrets for sensitive data\n  existingSecrets:\n    - name: nlweb-api-keys\n      mountPath: /app/secrets\n      defaultMode: 0400\n\n\n\nStep-by-Step Helm Installation Guide\n\n\n\nPrerequisites Setup\n\n\n\nBefore deploying NLWeb, ensure you have the following prerequisites:\n\n\n\n1. 
Add the Iunera Helm Repository:\n\n\n\nhelm repo add iunera https://iunera.github.io/helm-charts/\nhelm repo update\n\n\n\n2. Create Namespace and Secrets:\n\n\n\n# Create namespace\nkubectl create namespace nlweb\n\n# Create secrets for API keys\nkubectl create secret generic nlweb-openai-secrets \\\n  --from-literal=api-key=\"your-openai-api-key\" \\\n  -n nlweb\n\nkubectl create secret generic nlweb-azure-secrets \\\n  --from-literal=vector-search-key=\"your-azure-search-key\" \\\n  --from-literal=openai-api-key=\"your-azure-openai-key\" \\\n  -n nlweb\n\n\n\n3. Install with Custom Values:\n\n\n\n# Create custom values file\ncat > nlweb-values.yaml", "datePublished": "2025-06-19T10:57:54+01:00", "dateModified": "2025-06-19T19:12:36+01:00", "url": "https://www.iunera.com/kraken/nlweb/nlweb-deployment-in-kubernetes-gitops-style-with-fluxcd/", "author": "Chris", "image": "https://www.iunera.com/wp-content/uploads/NLWeb-orchestrated-with-k8s.jpg", "articleSection": "Machine Learning and AI, NLWeb, Our Projects", "keywords": "FluxCD, git, Gitops, k8s, Kubernetes, machineLearning, NLweb"}}], "query_id": ""}

data: {"message_type": "complete"}