Early Computers
Early computers stored programs and data on punch cards. Most cards held 80 characters, which is why early terminal programs typically displayed 80 characters per line. Punch cards were exactly what they sound like: physical cards. Each card held one line of a computer program or one piece of data. As users typed, a machine punched holes representing each letter or number. A program was a literal pile of cards, with the data cards placed after the program cards.

For example, a user who wanted to compute the average, median, high, and low figures in a set of data would write a program on a set of cards telling the computer how to analyze the data. Next, they would physically add the data cards to the stack, after the program cards. The whole stack then went into a special card reader that read the program cards, one after another, and then the data. Finally, the computer performed the calculations and printed the results.
Besides being clunky, loud, and wasteful of paper cards, this system wasted an enormous amount of expensive computing power. Computers, which cost millions of dollars in the 1960s, sat idle while the reader slowly fed in a program, and they could do nothing else until the card reading finished. The problem was especially pronounced at universities, where many students shared one computer. Students waited in line while the expensive computer did no useful work. Students and the computer were both idle.
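The batch job described above (average, median, high, and low of a data set) can be sketched in modern terms. This is only an illustration in Python, not a reconstruction of any period program: the function name summarize and the sample deck are hypothetical, and each string stands in for one data card.

```python
from statistics import mean, median

def summarize(data_cards):
    """Return the average, median, high, and low for a deck of data cards."""
    # Assumption: each card carries exactly one number, stored as text.
    values = [float(card.strip()) for card in data_cards]
    return {
        "average": mean(values),
        "median": median(values),
        "high": max(values),
        "low": min(values),
    }

# Hypothetical data deck: one figure per card, read in order.
deck = ["12", "7", "3", "20", "8"]
print(summarize(deck))
```

As with the card deck, nothing is computed until every data card has been read, which is the waiting period the paragraph above describes.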
Time-Sharing
In response, companies and universities built computers and operating systems that could serve more than one user at a time. To get around the card reader bottleneck, these new computers used terminals. While one person typed a program, which used little processing power, another could run a program, which required more.
The computer could still do only one thing at a time. However, by switching rapidly back and forth between tasks, it appeared to do multiple things at once. Rather than sitting idle, the CPU was almost always humming away at something, whether handling keystrokes, compiling a program, or running a piece of software.
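The switching described above can be sketched as a small simulation. The text does not say how these systems chose which task to run next, so this sketch assumes round-robin scheduling, one simple policy among many. The job names, work amounts, and the round_robin function are hypothetical.

```python
from collections import deque

def round_robin(jobs, time_slice):
    """Give each job one time slice in turn; return the slices in run order.

    jobs maps a job name to the units of CPU work it still needs.
    """
    queue = deque(jobs.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Unfinished jobs go to the back of the line.
            queue.append((name, remaining - ran))
    return timeline

# Hypothetical workload: a light typing task next to heavier jobs.
jobs = {"editor": 1, "compiler": 5, "report": 3}
for name, ran in round_robin(jobs, time_slice=2):
    print(f"{name} ran for {ran} unit(s)")
```

Only one job runs in any given slice, yet every job makes steady progress, which is why the machine appears to do several things at once.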
IBM OS/360, Unix, VMS
This new type of operating system, which supported multiple people doing multiple things at once, is called a time-sharing system. Research systems such as MIT's CTSS demonstrated the idea in the early 1960s. IBM released OS/360 in 1966; it was primarily a batch system, though its multiprogramming versions could keep several jobs running at once. Fred Brooks managed its development and later wrote a seminal project management book about the experience, The Mythical Man-Month.
Other systems soon followed, notably Multics and its successor, Unix. Later, DEC’s VMS operating system also proved especially good at multitasking. Unix led to BSD, which became part of the core of macOS, and inspired Linux, which powers much of today’s internet. Eventually, Microsoft hired away many VMS engineers to create Windows NT.
An evolution of time-sharing remains the reason that servers, personal computers, and even phones can do multiple things at once.