Obviously, you only have to subtract the divisor from the dividend until the dividend is smaller than the divisor; what's left is the remainder. Just as obviously, this could take a very long time if you're dividing a billion by 1. The trick is to shift the divisor left until it can't be shifted any further without exceeding the dividend. Subtracting this larger value amounts to subtracting the original divisor x times, where x is determined by the number of left shifts and the base you're working with.

When you reach that first limit, you perform the subtraction, shift the divisor right, and do it all again. This gives you a rapid way to subtract the divisor a large number of times with a minimal number of operations.
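The shift-subtract-shift cycle described above can be sketched in C. This is a minimal illustration, not production code; the function name `shift_div` and its interface are my own invention, and it assumes a nonzero divisor:

```c
#include <stdint.h>

/* Hypothetical sketch of shift-and-subtract division.
   Returns the quotient and stores the remainder via *rem.
   The divisor must be nonzero. */
static uint32_t shift_div(uint32_t dividend, uint32_t divisor, uint32_t *rem)
{
    uint64_t d = divisor;   /* widen so left shifts can't overflow */
    uint32_t quotient = 0;
    int shift = 0;

    /* Shift the divisor left until one more shift would exceed the dividend. */
    while ((d << 1) <= dividend) {
        d <<= 1;
        shift++;
    }

    /* Subtract where possible, shift right, repeat. Each successful
       subtraction at shift position s removes divisor * 2^s from the
       dividend, so it contributes 2^s to the quotient. */
    while (shift >= 0) {
        if (dividend >= d) {
            dividend -= d;
            quotient |= (uint32_t)1 << shift;
        }
        d >>= 1;
        shift--;
    }

    *rem = dividend;   /* what's left is the remainder */
    return quotient;
}
```

Dividing a billion by 1 this way takes about 30 iterations of each loop instead of a billion subtractions, since each shift position is visited only once.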

In reality, microprocessors cannot do anything but logical operations and addition. Addition itself is little more than an XOR with carry: the sum of two bits is their XOR, and the carry out comes from their AND. Subtraction is obtained by adding the complement. Multiplication is accomplished by repeated addition, division by repeated subtraction. Even higher operations are merely mixtures of these, with perhaps some algorithmic shortcuts thrown in for efficiency.
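To make the layering concrete, here is a sketch that builds addition from XOR and AND, then subtraction from complements, then multiplication from shifted additions. The function names are hypothetical; real ALUs do this in parallel hardware, not in a loop:

```c
#include <stdint.h>

/* Addition from pure logic: XOR gives the sum bits without carries,
   AND (shifted left) gives the carries, which are fed back in until
   none remain. */
static uint32_t logic_add(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t carry = (a & b) << 1;  /* carry out of each bit position */
        a = a ^ b;                      /* sum ignoring carries */
        b = carry;                      /* add the carries back in */
    }
    return a;
}

/* Subtraction by adding the two's complement: -b == ~b + 1. */
static uint32_t logic_sub(uint32_t a, uint32_t b)
{
    return logic_add(a, logic_add(~b, 1));
}

/* Multiplication by shifted, repeated addition: for each set bit
   of b, add the correspondingly shifted copy of a. */
static uint32_t logic_mul(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1)
            product = logic_add(product, a);
        a <<= 1;
        b >>= 1;
    }
    return product;
}
```

Note how each function leans only on the one below it plus shifts and logic ops, mirroring the hierarchy the paragraph describes.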