Data teams need to grow up

Thine solution shall have at least general architecture documentation. Thine solution shall log what it did. Thine solution shall test its own behaviour for sanity. Thine solution’s results shall be monitored automatically. Do these seem familiar? For any software developer, of course they do. They have been around for two decades now. For data scientists … Read more

Data science market during and after the COVID-quake, is there a hit, really?

It is not surprising that startups with promiseware are hit when the market gets risky; some investors who were pseudo-rich with a stock portfolio have now been punched financially so hard that they are closing their outbound money-streams and pulling funding to survive. But. Companies with strong working products (like grocery trade) realize they don’t need … Read more

Data leadership during a pandemic

Today Finland decided to close its borders, schools, and public cultural gathering places such as libraries and the National Opera, to protect the weak and the old from getting infected with a virus potentially deadly to them. The part of Prime Minister Sanna Marin’s speech that caught a data-man’s ear was that testing for … Read more

Automate your data quality monitoring

As is obvious, data-driven solutions built or run with crappy data will not create business value; at worst, they do actual damage to business or people. Solving the issue is not hard. The problem is that too large a portion of data science sprint deliveries are measured by quantity, not by quality. “How many … Read more

Solutions built with crap data will be crap

This is obvious. And yet, teams go on a-building even after seeing that the data is too crap to answer the questions in mind. In an age when anyone with an internet connection can get their AI terminology right, why do people still… …train machine learning models with crappy data? The model(s) will be … Read more

Dead, so young – Microsoft Azure Data Lake Analytics

My first touch with “ADLA” was one of familiarity (lo, ye olde YARN and a DAG-implementation, I salute you) and a thought of “you guys will come up with something better, soon”. Why the skepticism? Well, the combination of SQL and C# gave programmability… but for whom? IMHO: ADLA was mostly a quick-fix from Azure to serve … Read more

Advanced data services

We design and implement data assets on-site and in the cloud(s) for machine learning and AI: from data extraction, transformation and loading, through automated (and on-demand) data quality audits, assessment and monitoring, to statistics and multivariate modeling. See more on LinkedIn