
We use modern technologies every day. For better, mostly.

They have become essential to us: we communicate, commute, and collaborate using cutting-edge technologies, consuming computing power that was unthinkable 25 years ago. A modern smartphone's peak performance is roughly ten times that of a supercomputer from the 1990s!

Additional computational power makes further advances possible. Think of the explosion in convolutional neural network adoption: it happened largely thanks to the parallel computation framework and devices provided by NVIDIA.

That, in turn, sparked wide adoption of other machine learning algorithms, enabling products that were never possible before: real-time speech-to-text conversion, real-time translation, contextual search, etc…


Every day I build containers locally. Many of us do: Docker is a crucial tool in the developer arsenal these days. Building and maintaining 1–2 containers is fine, but what about an app built from 10 containers? 15? 20? Each local build, even with the cache enabled, takes a significant amount of time. Multiply that by the number of daily rebuilds, and you'll see hours of your precious time leaking away!

Sure, there are CI/CD tools capable of parallel building, but what about local builds? …
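For illustration, here's a minimal sketch of what a parallel local build could look like: a Python script shelling out to `docker build` from a thread pool. The service names are hypothetical, and recent Docker Compose versions can parallelize builds on their own; this just shows the bare idea.

```python
# A minimal sketch of parallel local builds (hypothetical image/service names).
import subprocess
from concurrent.futures import ThreadPoolExecutor

SERVICES = ["api", "worker", "frontend"]   # assumed service directories

def build(service: str) -> int:
    # docker does the heavy lifting; threads just keep the builds running side by side
    return subprocess.call(["docker", "build", "-t", f"myapp/{service}", service])

with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
    results = list(pool.map(build, SERVICES))

print("all builds succeeded" if all(r == 0 for r in results) else "some builds failed")
```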



A typical ML model training workflow looks fairly standard these days:

  • Prepare data for training and validation
  • Choose architecture for the model
  • Pick initial parameters that make sense
  • Tweak the model and its parameters until you get a result that meets your requirements

Today I'd like to explore the last step: tweaking the model. There's even a special name for this process: hyperparameter optimization, a relatively big field with existing tools and some competition between them.

However, if all you need is hyperparameter tuning, without any bells or whistles on top, it'll be fairly simple software…
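To show just how simple, here's a bare-bones random search sketch; train_and_score is a hypothetical stand-in for whatever trains your model and returns a validation metric.

```python
# A bare-bones random search over hyperparameters; everything here is illustrative.
import random

def train_and_score(lr, batch_size, hidden_units):
    # hypothetical stand-in: train your model here and return a validation metric
    return random.random()

SPACE = {
    "lr": [1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": [16, 32, 64],
    "hidden_units": [64, 128, 256],
}

best_score, best_params = float("-inf"), None
for _ in range(20):  # arbitrary trial budget
    params = {name: random.choice(values) for name, values in SPACE.items()}
    score = train_and_score(**params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)
```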


A year ago I posted an article showing how to build a trivial sentence breaker and tokenizer in Java with DeepLearning4J. Recently I needed to build a similar model in Python, so I decided to write a follow-up post that shows the DeepLearning4J -> Keras migration, explains some of the "DL4J vs Keras" nuances, and highlights some of the issues Keras has at the moment.

The primary goal of this small project is to get a model that can segment raw Russian/Ukrainian text into independent sentences and independent tokens. On top of that, each individual token must get a part-of-speech…
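To give a flavor of the Keras side (not the article's actual model; the vocabulary size and tag set are assumptions), a character-level sequence tagger for this kind of segmentation can start as small as this:

```python
# A minimal character-level sequence tagger sketch in Keras; all dimensions assumed.
from tensorflow import keras
from tensorflow.keras import layers

VOCAB = 128   # assumed character-vocabulary size
N_TAGS = 4    # e.g. token boundary / sentence boundary / other

inp = keras.Input(shape=(None,), dtype="int32")        # a window of character ids
x = layers.Embedding(VOCAB, 32, mask_zero=True)(inp)
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
out = layers.Dense(N_TAGS, activation="softmax")(x)    # one tag per character

model = keras.Model(inp, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```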


Recently I've started looking into computer vision problems, and image denoising/reconstruction is one of the fun ones: it has more than one viable solution. So I decided to take a stab at it and build a basic image denoiser with a neural network.

Luckily, I know that U-net architectures typically work well for this kind of task. The U-net idea is pretty close to autoencoders: you have a feature-extraction part of the network, followed by a reconstruction part. And since convolution/deconvolution layers are used all the way through the network, output pixels are not independent of each other.

This is how neural…
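As a toy illustration of the idea (a sketch with assumed shapes, not the post's network): one downsampling step, one upsampling step, and a single skip connection in Keras.

```python
# A toy U-net-style denoiser sketch: shapes and depth are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

inp = keras.Input(shape=(128, 128, 1))          # noisy grayscale image
c1 = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)
p1 = layers.MaxPooling2D()(c1)                  # 64x64: feature extraction
c2 = layers.Conv2D(64, 3, padding="same", activation="relu")(p1)
u1 = layers.Conv2DTranspose(32, 3, strides=2, padding="same")(c2)  # back to 128x128
m = layers.Concatenate()([u1, c1])              # the U-net skip connection
out = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(m)  # denoised image

model = keras.Model(inp, out)
model.compile(optimizer="adam", loss="mse")
```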


The microservices architecture lets you do something that was not really viable, or even possible, before: you can build applications with multiple languages at once, and that's simply amazing! But what if you are restricted to a certain tech stack and languages? Or what if all you need is an MVP (minimum viable product)? Or you don't want to build or use heavyweight ML systems? Or you just need something to ship the aforementioned MVP quickly in your given stack and language?

Let’s consider the case of an MVP for a distributed Java application that uses Apache Kafka for message…


No matter how many NLP frameworks/libraries/models are available out there, eventually you'll need a language model for which no .zip has been made publicly available by some good folks, because life is a kind of N+1 problem. So I wrote a small article with a basic walk-through of solving such a problem the easy way.

In such a situation, chances are you'll need a sentence breaker and/or tokenizer as the first step. Sometimes it might even be the only step you need to build on your own. Sure, there are universal algorithms, like whitespace tokenizers, char-level representations, sub-word embeddings…
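For contrast with a learned model, a deliberately naive baseline of both steps fits in a dozen lines; real models exist precisely because rules like these break on abbreviations, quotes, and so on.

```python
# A naive whitespace/punctuation baseline: the "first step" at its crudest.
import re

def sentences(text: str):
    # split on ., ! or ? followed by whitespace; fails on abbreviations
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokens(sentence: str):
    # words or single punctuation marks
    return re.findall(r"\w+|[^\w\s]", sentence)

for s in sentences("Привет, мир! Как дела?"):
    print(tokens(s))
```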


Blockchain is simple, and simplicity makes it useful in various fields. That was the idea, I believe. Now I think blockchain failed its mission in this iteration.

The reason is simple: an algorithm can't be a religion, and blockchain definitely looks like a religion now. In search of product/market fit, there were so many promises that blockchain might solve this or that problem that the masses bought it. That was easy, given how big the hype was, thanks to Bitcoin prices. So people came to believe in Bitcoin/blockchain and spread their beliefs among communities. That's what any religion does, regardless of actual efficiency…


This post is a continuation of the "Performance of our models: parallelism" post, and today we'll talk about how hardware properties affect the performance of our models and what we can choose for better performance. In a general performance talk we could discuss network I/O performance or RAID configurations for better throughput, etc., but since we're discussing the performance of ML/DL models, we're fairly limited to math performance.

And the main hardware factors affecting performance are:

• Instruction sets & compiler optimizations available

• Memory bandwidth

Memory bandwidth:

These days you don’t really have too much choice here. Once you’ve decided which CPU will be used…
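A quick way to feel the difference on your own box (a rough NumPy sketch, not a rigorous benchmark): an element-wise pass over a large array is bandwidth-bound, while a matmul over the same array is compute-bound.

```python
# Rough illustration: memory-bandwidth-bound vs. compute-bound work.
import time
import numpy as np

a = np.random.rand(4096, 4096).astype(np.float32)

t0 = time.perf_counter()
for _ in range(10):
    b = a + 1.0            # streams the whole array: limited by memory bandwidth
print("element-wise:", time.perf_counter() - t0)

t0 = time.perf_counter()
c = a @ a                  # reuses data in cache: limited by FLOPS / instruction sets
print("matmul:", time.perf_counter() - t0)
```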


How will model performance improve if we move the model to <insert hardware name here>? This kind of question is rather popular in our support channel when it comes to actual production deployment.

Luckily, there's a limited number of factors affecting model performance: the operations and algorithms used in the model, data dimensionality, and the hardware features available. There's also one mega-factor: the potential benefit we can gain from highly parallel execution, i.e. a multi-core multi-CPU box, CUDA, or an MPI-like environment where our model will be serving us.

In this post we'll take a quick look from the parallelism standpoint.

Operation-level parallelism:

Our DL models operate…
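A small sketch of the two layers of parallelism in play (shapes assumed, purely illustrative): each matmul is parallelized internally by the BLAS library, and independent branches can additionally run side by side, since NumPy releases the GIL during the computation.

```python
# Two layers of parallelism: inside each op (BLAS) and across independent ops.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

x = np.random.rand(1024, 1024).astype(np.float32)
w1 = np.random.rand(1024, 1024).astype(np.float32)
w2 = np.random.rand(1024, 1024).astype(np.float32)

def branch(w):
    # each matmul is itself parallel inside BLAS (operation-level parallelism)
    return np.maximum(x @ w, 0.0)

with ThreadPoolExecutor(max_workers=2) as ex:
    # independent branches can also run concurrently (graph-level parallelism)
    y1, y2 = ex.map(branch, [w1, w2])
```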

raver119

Deep Learning Developer.
