Parallel computing has become more and more common over the years, and I think a large part of that is because nearly every device we use today has a multi-core processor. I really do mean everything; even the Apple Watch has a dual-core processor. Naturally, a lot of software has caught up with this trend and introduced native parallelization into the mix. That’s part of the reason we at HarperDB like Node.js so much; there are built-in modules and common libraries that make parallelization a breeze. However, plenty of software products and solutions out there have not yet caught up to the parallel world and are still running single-threaded. But I digress; let’s talk about some examples where parallel computing can really improve speed and cut costs…
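To give a sense of how low the barrier is, here’s a minimal sketch using Node’s built-in worker_threads module. The prime-counting workload is just a stand-in for any CPU-heavy task, not anything HarperDB-specific:

```js
// count-primes.js — a minimal worker_threads sketch (Node.js 12+)
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

// A deliberately CPU-bound stand-in workload.
function countPrimes(limit) {
  let count = 0;
  for (let n = 2; n <= limit; n++) {
    let isPrime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { isPrime = false; break; }
    }
    if (isPrime) count++;
  }
  return count;
}

if (isMainThread) {
  // Main thread: spawn a worker so the heavy loop runs off the event loop.
  const worker = new Worker(__filename, { workerData: 5_000_000 });
  worker.on('message', (result) => console.log(`primes below 5,000,000: ${result}`));
  worker.on('error', (err) => console.error(err));
  console.log('main thread stays responsive while the worker crunches...');
} else {
  // Worker thread: run the CPU-bound loop and report back.
  parentPort.postMessage(countPrimes(workerData));
}
```

Running `node count-primes.js` prints the "stays responsive" line immediately and the result once the worker finishes, which is the whole point: the expensive work happens on another core.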
Real-Time Big Data Analytics
Big data is complicated; just ask anyone who’s ever tried to work in that world. When it comes down to sheer number crunching, you need serious computing power, and turning unstructured, inconsistent data into actionable business intelligence is nontrivial. In the past, our founders took the reins of a multi-million-dollar supercomputer to solve a problem like this. It worked, but it certainly wasn’t cheap. After that experience they focused more on parallelization, connecting multiple cloud instances together to solve the same problem at a fraction of the cost.
IoT Data Collection and Analysis
At HarperDB we are huge proponents of pushing processing to the edge. Using edge devices to do an initial pass on data before sending it to the cloud for larger-scale analysis is a solid pattern. But what if we could remove the cloud altogether? What if we could run a massive query across the edge itself? That would be parallel computing at a scale we’ve never seen. Granted, this is still a grand vision and we’re not quite there yet, but it would be hugely powerful. Either way, IoT data is a lot to process. Using the edge to filter data first, then having a cluster in the cloud process and analyze it, can improve the speed of aggregations.
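As a toy illustration of that edge-first filtering, here’s a sketch that thins out raw sensor readings on the device and only ships the anomalies up for heavier analysis. The ingest URL, the rolling-average window, and the 10% threshold are all made up for the example:

```js
// edge-filter.js — filter sensor readings on the edge before uploading (Node.js 18+ for global fetch)
// The ingest URL and thresholds below are placeholders, not a real endpoint.
const CLOUD_INGEST_URL = 'https://example.com/ingest';

// Keep only readings that deviate noticeably from a rolling average.
function filterAnomalies(readings, windowSize = 20, threshold = 0.1) {
  const anomalies = [];
  for (let i = windowSize; i < readings.length; i++) {
    const window = readings.slice(i - windowSize, i);
    const avg = window.reduce((sum, r) => sum + r.value, 0) / windowSize;
    if (Math.abs(readings[i].value - avg) / avg > threshold) {
      anomalies.push(readings[i]);
    }
  }
  return anomalies;
}

// Ship only the filtered batch to the cloud for deeper analysis.
async function uploadBatch(batch) {
  if (batch.length === 0) return;
  await fetch(CLOUD_INGEST_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(batch),
  });
}

// Simulate a burst of readings, filter locally, and only send the interesting ones.
const readings = Array.from({ length: 1000 }, (_, i) => ({
  sensorId: 'temp-01',
  timestamp: Date.now() + i * 1000,
  value: 20 + Math.random() * (i % 97 === 0 ? 10 : 0.5), // occasional spikes
}));

uploadBatch(filterAnomalies(readings))
  .then(() => console.log('anomalies uploaded'))
  .catch((err) => console.error('upload failed:', err));
```

The interesting part is how little crosses the network: the bulk of the readings never leave the device, and the cloud only has to aggregate what the edge already decided was worth a look.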
Natural Language Processing (NLP)
By now you’ve probably noticed a pattern here: parallelization is good for big data, and NLP is yet another example of big data analysis. In this case, the task of processing text can be broken up into many smaller pieces and distributed across a network of servers. Typical NLP algorithms are single-threaded, since that’s how we tend to process language in real life. That said, there have been significant advances in parallel NLP algorithms that can crunch through text much faster than ever before.
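Here’s a small-scale sketch of that split-and-distribute idea: a word-frequency count that chops a corpus into chunks, hands each chunk to its own worker thread, then merges the partial counts. In practice the chunks would go to separate servers, and real NLP work is far more involved than tokenizing on whitespace:

```js
// parallel-wordcount.js — map-reduce style word counting with worker_threads (Node.js 12+)
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

if (isMainThread) {
  // Pretend this is a big corpus; here it's just a repeated sentence.
  const corpus = 'the quick brown fox jumps over the lazy dog '.repeat(50_000);
  const numWorkers = 4;
  const chunkSize = Math.ceil(corpus.length / numWorkers);
  // Note: slicing by character can split a word at a chunk boundary — fine for a sketch.
  const chunks = Array.from({ length: numWorkers }, (_, i) =>
    corpus.slice(i * chunkSize, (i + 1) * chunkSize)
  );

  // Map: each worker counts word frequencies in its own chunk.
  Promise.all(
    chunks.map(
      (chunk) =>
        new Promise((resolve, reject) => {
          const worker = new Worker(__filename, { workerData: chunk });
          worker.on('message', resolve);
          worker.on('error', reject);
        })
    )
  ).then((partialCounts) => {
    // Reduce: merge the per-chunk counts into one table.
    const totals = {};
    for (const counts of partialCounts) {
      for (const [word, n] of Object.entries(counts)) {
        totals[word] = (totals[word] || 0) + n;
      }
    }
    console.log(totals);
  });
} else {
  // Worker thread: naive whitespace tokenization and counting.
  const counts = {};
  for (const word of workerData.split(/\s+/)) {
    if (word) counts[word] = (counts[word] || 0) + 1;
  }
  parentPort.postMessage(counts);
}
```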
Medical Analytics
A notoriously difficult task in medicine is analyzing medical images. Processing and scanning these images is complex and time-consuming. Dividing that processing up across multiple threads or servers can decrease the time required to produce results. There are plenty of other opportunities in medicine to apply parallel computing, such as DNA analysis, which is also computationally expensive.
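To make the image case concrete at a very small scale, here’s a sketch that splits a grayscale image (just a synthetic pixel buffer here) into horizontal bands and lets each worker thread apply a threshold to its own band through a shared buffer, so nothing is copied between threads. Real medical imaging pipelines are of course far more sophisticated than a binarize pass:

```js
// parallel-threshold.js — split an image into bands across worker_threads (Node.js 12+)
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

const WIDTH = 2048;
const HEIGHT = 2048;

if (isMainThread) {
  // A synthetic grayscale "scan" in shared memory so workers can write in place.
  const shared = new SharedArrayBuffer(WIDTH * HEIGHT);
  const pixels = new Uint8Array(shared);
  for (let i = 0; i < pixels.length; i++) pixels[i] = Math.floor(Math.random() * 256);

  const numWorkers = 4;
  const bandHeight = Math.ceil(HEIGHT / numWorkers);

  Promise.all(
    Array.from({ length: numWorkers }, (_, i) => {
      const startRow = i * bandHeight;
      const endRow = Math.min(startRow + bandHeight, HEIGHT);
      return new Promise((resolve, reject) => {
        const worker = new Worker(__filename, {
          workerData: { shared, startRow, endRow, width: WIDTH, cutoff: 128 },
        });
        worker.on('message', resolve);
        worker.on('error', reject);
      });
    })
  ).then(() => console.log('all bands thresholded in parallel'));
} else {
  // Worker thread: binarize only the rows assigned to this band.
  const { shared, startRow, endRow, width, cutoff } = workerData;
  const pixels = new Uint8Array(shared);
  for (let i = startRow * width; i < endRow * width; i++) {
    pixels[i] = pixels[i] >= cutoff ? 255 : 0;
  }
  parentPort.postMessage('done');
}
```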
Smart Cities
Building a smart city is obviously a big and complex undertaking. The vision has been around for a while, but we’re just now starting to see the emergence of real smart city technology. Data is being recorded, but is it really being used? In a lot of cases it’s not; the data is commonly thrown away without any analysis. Smart cities already have an intelligent edge: the devices have the processing power to be used as a true parallel computing network. Running analytics directly on the edge could yield significant cost savings compared to standard cloud processing. Smart cities could become some of the most powerful parallel computing networks ever made.
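Here’s roughly what a scatter-gather query across a city’s edge nodes could look like. The node addresses, the /query endpoint, and the table in the SQL are purely illustrative, not an existing smart-city or HarperDB API:

```js
// edge-scatter-gather.js — fan a query out to edge nodes and merge results (Node.js 18+ for fetch)
// The node URLs and /query endpoint below are hypothetical, for illustration only.
const EDGE_NODES = [
  'http://10.0.1.11:9000',
  'http://10.0.1.12:9000',
  'http://10.0.1.13:9000',
];

// Ask one edge node to run the query over its local data and return its rows.
async function queryEdge(nodeUrl, sql) {
  const res = await fetch(`${nodeUrl}/query`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ sql }),
  });
  if (!res.ok) throw new Error(`node ${nodeUrl} responded ${res.status}`);
  return res.json();
}

// Scatter: run the same aggregation on every node at once. Gather: merge whatever comes back.
async function scatterGather(sql) {
  const settled = await Promise.allSettled(EDGE_NODES.map((url) => queryEdge(url, sql)));
  const rows = settled
    .filter((r) => r.status === 'fulfilled')
    .flatMap((r) => r.value);
  const failures = settled.filter((r) => r.status === 'rejected').length;
  return { rows, failures };
}

scatterGather('SELECT sensor_id, AVG(noise_db) AS avg_noise FROM readings GROUP BY sensor_id')
  .then(({ rows, failures }) => {
    console.log(`${rows.length} partial results merged, ${failures} nodes unreachable`);
  });
```

Promise.allSettled keeps the whole query alive even when a few street-corner devices are offline, which is the reality of any network this large.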
These are certainly not all of the places where parallel computing could be utilized to increase speed and/or lower costs, just some thoughts we had on the subject. Parallel computing is possible nearly everywhere, since individual devices now support the basic functionality to execute in parallel. Beyond that, utilizing a network of machines, whether they’re servers or edge devices, can show real value in improving existing processes with parallelization.