Posts

Showing posts from June, 2020

031: GPX file analysis with Spark3

Digital mobility comes with GPS navigation and tracking. We'll use Apache Spark for processing and analysing GPX files generated by a running watch. The daily mood Besides riding, I also like running in my free time. Well, because of my current lifestyle and health, it is a category of time investment that sits somewhere in between leisure and duty. As part of my running activities, I recently started interval training on a 400m tartan track. I run 8x 1000m fast + 600m slow, for a total of about 1 hour of workout plus 15 min warm-up and 15 min cool-down. This is a rather traditional training method compared to modern high-intensity interval training (HIIT), but I actually needed a convenient plan for improving my racing pace, without being too concerned about potential pain and injury. I am equipped with a second-hand Garmin Forerunner watch, probably one of the most popular devices for capturing passable GPS/HR measures at an affordable price range of $100 to $200 (dependin

030: ML Model deployment with Databricks

Machine Learning (ML) deployment is one of the dark sides of both Data Science and Data Engineering. Managed services like Databricks might help. The daily mood As already mentioned in a previous post, I have the privilege to shadow a starting Data lake project. The team already prepared data and built a first Machine Learning (ML) model for a specific use-case. They are currently in the process of deploying the scoring application and find this quite challenging. We are going to discuss why it is difficult, and of course how technology and automation may help. Big Data & ML adoption In 2009, the Knowledge Discovery and Data Mining (KDD) conference became a competition (KDD Cup) which reached the IT world with a disruptive report of lessons learnt in large-scale ML projects. The industry had just started to realize the rise of Big Data (e.g. IoT) and the potential of ML for business. In the following years, Business Intelligence (BI) organizations not only embraced digital

029: From SCM to DevOps with GitHub Actions

Evaluation of GitHub Actions as a potential alternative (replacement or complement) to Jenkins for Continuous Integration (CI) in our organization. The daily mood Besides spending time reporting on my activities, I have the opportunity to back up one of our team leads by mentoring a student for about one month. He has already worked part-time for about 3 years in our organisation, as a Java developer. He is now looking for a short engagement within the architecture team. We agreed on the goal of creating a document that collects all required information about GitHub, and eventually running a small PoC/showcase. Git Git is a distributed version-control system for tracking changes in source code during software development. It was created by Linus Torvalds in 2005 for development of the Linux kernel. It is free software under the GNU GPL license. There are Git clients available for many different operating systems and code editors. And, as for HTTP servers, there are tons of hosting providers and distribu

028: Auto-deploy featured resources with Flux Kustomize

FluxCD GitOps operator supports Kustomize for dynamically featuring Kubernetes resources including HelmRelease. This post is about auto-deployment. The daily mood Another week is over and I have made good progress on this project around Helm chart deployment automation. As described in my last posts, we've been using Flux Helm Operator for abstracting Helm installation, and Kustomize for building custom configurations. Now we will look at how Flux can integrate with Kustomize. In the end we'll get a setup that is not common: we'll use the FluxCD GitOps operator in conjunction with the Helm operator, but disconnected from each other. As in connected mode, the solution should automatically and bidirectionally sync a configuration change made either in the Git repo (Flux HelmRelease + Kustomize) or on the Kubernetes cluster (released Helm charts). Requirements An empty Git repository Assets from my previous post on Flux Helm Operator + Kustomize A Kuber

027: Flux featuring Helm charts - Solution draft

FluxCD GitOps operator supports Kustomize for dynamically featuring Kubernetes resources including HelmRelease. This post is about solution analysis. The daily mood My manager looked at my current activity on the Helm deployment toolchain, and at my latest findings as described in my previous post on Flux Helm Operator + Kustomize. He seems to generally support my directions and approach, but also raised some concerns and questions that I shall answer in this solution draft. Context We currently operate hundreds of services in Kubernetes clusters. Without claiming to have a Microservice architecture, we generally try to design small-sized standalone components that communicate through HTTP. We also try to develop and test with agility (Scrum), create immutable deliveries (Maven artifacts, Docker images, Helm charts), use resilient infrastructure, and operationalise for reliability. Our services are actually shared by a dozen different teams, combined via logical stacks on purpose (ex.

026: Flux Helm Operator + Kustomize

FluxCD GitOps operator supports Kustomize for dynamically featuring Kubernetes resources including HelmRelease. This post is about parametrization. The daily mood I got in touch with another teammate, who is well aware of Helm deployments and willing to help with them. He gave me some valuable pointers on our current solution (based on Ansible), which gave me an additional level of understanding. He also pointed out that, following the past PoC on Flux, HelmReleases were introduced as the future solution, but had the negative effect of making life even more complicated for developers, and configurations even more replicated by SRE. He suggested that I study the possibility of using Kustomize to feature HelmReleases, so that replication can be avoided. In fact I had not considered Helm and Kustomize as a potential fit so far, except if we wanted to render Kubernetes resources first, then configure them as described in this article. I realised how many different tools we were