After spending a large amount of time and energy on machine learning and computational intelligence over the past years, I recently started to reflect on the pros and cons of the various technologies. These days I think I embrace every technology (language) and any platform without prejudice, in the sense that I can value the good in each culture and don’t have the naive attitude that one can rule them all. There ain’t a magical platform for machine learning, and there ain’t an all-embracing programming language or repository which can tackle any business problem. Every approach has good and bad; every culture (people, techniques, libraries etc.) has its own virtues. Saying that CRAN is better than PyPI, or that Mathematica’s syntax is better than R’s idiosyncrasies, is a black-and-white attitude which overlooks the complexity of business needs and the fast-changing world we live in. So here is my wisdom as a consultant: listen to your customers, keep learning and keep an open mind.

The four cultures I’m most familiar with are R, .NET, Python and Mathematica, but you can replace any of these and you’ll likely end up with similar conclusions.

The cloud has changed everything we do. In a way, a lot of software has become language-agnostic. For pretty much any language you can find a cloud platform, tons of libraries, and distributed solutions for any price and any problem at hand. That doesn’t mean they are all equally good, but for the bulk of consultancy needs you can choose the language and the libraries you like, and some provider out there will host it. The Wolfram Cloud will scale your Mathematica or C code. Azure will host pretty much anything these days. AWS can handle any Spark or Hadoop needs.

There are repositories for anything and everything. Need some Node.js package? You’ll find it. Some esoteric bio-engineering sequencing algorithm? The R package repository has it. The Mathematica world is backed by solid (and expensive) experts delivering high-end, specialized extensions. The R world stays afloat thanks to the libraries created in universities and research labs. Millions of JavaScript devotees keep npm alive day after day. According to some, programming these days amounts to finding the right libraries and knitting them together. That is somewhat extreme, but there is truth in it: open source and other repositories contain pretty much everything one can wish for.

Interoperability has always been, still is and always will be an issue. If you have a delightful algorithm or solution in Python, it’s unlikely that it works well with, say, XAML data visualization. If you have a Mathematica notebook solving the Fokker-Planck equation, you won’t integrate it into your F# program without pain. While JSON, XML, OData and the like have solved a large number of data exchange problems, the same cannot be said of software integration. Someone out there should start thinking about something like ODBC for programming languages, a Rosetta Stone at the binary level.
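To make the distinction between data exchange and software integration a bit more concrete, here is a minimal Python sketch. The function names and the sample numbers are purely illustrative assumptions, not taken from any real project. It shows the part that does work across cultures: the result of a computation travels as JSON text, which any language with a JSON parser can read.

```python
import json


def solve_and_export(values):
    """Hypothetical producer: compute a small summary and emit it as JSON text."""
    result = {"mean": sum(values) / len(values), "count": len(values)}
    return json.dumps(result)


def import_result(payload):
    """Hypothetical consumer: parse the JSON text back into plain data.

    This step could just as well happen in R, F# or Mathematica,
    since only the data crosses the boundary, not the code.
    """
    return json.loads(payload)


payload = solve_and_export([1.0, 2.0, 3.0])
summary = import_result(payload)
print(summary["mean"], summary["count"])
```

What the sketch cannot do is the harder part: the consumer receives the numbers, but it has no way to call `solve_and_export` itself from another language without some bridge, wrapper or service in between.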

Scripting is here to stay. More and more, we talk to software, aka scripting. Notebook and scripting interfaces have been present in some cultures from the very beginning (Python and Mathematica), but nowadays they are universal. We mix markdown and code in notebooks, script ML nodes in Azure ML, knit HTML and R scripts, and instantiate dataviz in F#. I’m unsure what precisely pushed this tendency, but there it is: coding with interactivity is fun and universal. In some cases (like R) the cycle of testing and seeing the results is a highly productive mix. The need for compiling is sometimes secondary.

Open source has become ubiquitous. Even Microsoft’s .NET Core is open source now. The downside is that open sourcing has become a convenient way to build a community around a product or to dump an obsolete one. The good part is the tremendous amount of knowledge and shared code one has access to. Machine learning (artificial intelligence, vision) in particular is wide open because it has been nourished for a long time by academics (in common and less common code formats).

Of course, large parts of business and corporate intellectual property remain undisclosed. SAP bits, mainframe algorithms and whatnot remain largely in the same state as twenty years ago: closed, monolithic and obscure. To some extent this ensures that things like insurance data mining and financial mining remain immune to the rapid spread of sniffers and hackers. The brave new (open and cloud) world we live in is indeed a good thing in the hands of people with good intentions, much less so when you look at all the forms of (increasingly intelligent) abuse.