A multicore processor has two or more processor cores on the same integrated circuit. Early on in practical applications, multiple cores were used independently of each other; concurrency isn't much of an issue if cores are not working in tandem on the same problem. Supercomputers and high-performance computing (HPC) saw multiple cores first. One difference between multicore processors in HPC and in embedded systems is that embedded systems are subject to size constraints.
How are multicores used? The mainstream use of multicore processors is in data centers and HPC. One example is the Texas Advanced Computing Center (TACC) at the University of Texas at Austin. TACC's supercomputer, named "Stampede," uses Intel® Xeon Phi™ chips and computes more than 10 quadrillion operations per second at peak performance. (The Intel® Xeon Phi™ family of processors packs up to 72 cores into one chip.)
Concurrency, Synchronization, and Parallelism
One of the biggest challenges with multicore processors is concurrency. When multiple cores work in parallel on the same job, programmers must think about whether one process or action happens before or after another being handled by a different core. Previously, a single CPU handled a program from start to finish, beginning with initialization. Issues like stack overflow could still occur, but overflowing a memory region shared between two cores is far more difficult to predict and avoid. Two cores are like two brains: two are better than one, since double the work can get done, but what happens when there's a conflict? Working with multiple cores has implications for memory, operating systems, the interconnect between the cores, and even application software. The simple act of breaking up tasks so that all cores remain fully utilized, not to mention synchronized, can be a challenge unless some layer of abstraction is provided. Multiple clocks, memories, caches, and interrupts spread across multiple cores make debugging harder as well. We have come from simple 8-bit to 64-bit single-core programming and debugging; keeping track of what is in the registers and what happens as you step through a program becomes immensely more complicated with separate cores.
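The ordering question raised above, whether one core's action happens before another's, can be sketched in a few lines. The following is an illustrative sketch only: Python threads stand in for cores, and the names `producer` and `consumer` are ours, not part of any particular system. Without the explicit signal, there would be no guarantee that the consumer reads the result after the producer has written it.

```python
import threading

# One "core" (here, a thread) depends on a result computed by another.
# An Event establishes a happens-before relationship between them.

result = None
ready = threading.Event()

def producer():
    global result
    result = 6 * 7      # some computation on "core A"
    ready.set()         # signal that the result is now valid

def consumer(out):
    ready.wait()        # block until "core A" has produced the value
    out.append(result)  # safe to read only after the signal

out = []
t_consumer = threading.Thread(target=consumer, args=(out,))
t_producer = threading.Thread(target=producer)
t_consumer.start()
t_producer.start()
t_consumer.join()
t_producer.join()
# out now holds [42], regardless of which thread was scheduled first
```

Note that the consumer may well start running before the producer; the point of the `Event` is that the outcome is the same either way.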
As long as one core can perform tasks that do not depend on another core, processes proceed rapidly without waiting, and all cores can be fully utilized. There are a couple of ways to use multiple cores: you can have all cores do the same task in parallel to get through a heap of processing faster, or you can have each core run its own series of tasks from start to finish. Dependencies, where one core must produce a result before another core can continue, create challenges. The point of having multiple cores is to get more done faster, but dependencies can defeat that purpose as one core sits idle waiting on the results of another. Arranging the work so that no core has to wait on another brings a new level of optimization to multicore programming. Shared resources across multiple cores behave in much the same way as multiple threads within a single program, except that to rule out any possibility of a shared memory location being overwritten, you must consider every core that might be working with the shared resource, not just threads on the same core. Other complications arise outside the cores themselves, such as changes to global data: each core has an independent cache, and one core may be working with a stale copy of a global variable. This creates a new kind of problem, because at any given moment the cores can be out of sync with respect to the outside world.
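The shared-resource point can also be sketched in code. This is a minimal sketch using Python's `threading.Lock`; on real multicore hardware the same role is played by OS mutexes or hardware-supported atomic operations, but the discipline is identical: every party that touches the shared data must take the lock.

```python
import threading

# Several threads (standing in for cores) increment one shared counter.
# Each read-modify-write is wrapped in a lock, so no update is lost.

counter = 0
lock = threading.Lock()

def worker(n_increments):
    global counter
    for _ in range(n_increments):
        with lock:        # only one thread may touch `counter` at a time
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is now exactly 4 * 100_000; remove the lock and some of the
# read-modify-write sequences can interleave, silently losing updates
```

The essential habit is that the lock protects the data, not the code: any other thread (or, on real hardware, any other core) that updates `counter` without taking the same lock reintroduces the race.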
Fortunately, multiple cores have been around long enough that tools now exist to handle the issues they create. The point is to be aware that working with multiple cores is like spreading the work of a single program over several processors: the processes going on under the (one) hood are much more complicated than when working with a single processor.