Bits are a stream of binary numbers that computers use to store many types of data. It ultimately comes down to ones and zeros, whether you’re working with text, photos, or videos. Bitwise operators in Python allow you to alter individual bits of data at the most granular level.
In your Raspberry Pi project or elsewhere, you can utilize bitwise operators to build algorithms like compression, encryption, and error detection, as well as to control physical devices. With high-level abstractions, Python frequently isolates you from the underlying components. In practice, the overloaded flavors of bitwise operators are more common. However, you’ll be astonished by their idiosyncrasies if you work with them in their original form!
In this article, you’ll learn how to do the following:
• Bitwise operators in Python can be used to manipulate individual bits
• Binary data can be read and written in a platform-independent manner
• Bitmasks are used to compress data into a single byte
• In custom data types, overload Python bitwise operations
• Secret messages can be hidden in digital photographs
Bitwise Operators in Python: An Overview
The arithmetic, logical, and comparison operators are only a few of the operators available in Python. They can be thought of as functions that make use of a more concise prefix and infix syntax.
Python lacks postfix operators such as the increment (i++) and decrement (i—) operators often seen in C.
Bitwise operators are nearly identical in appearance across programming languages:
Operator Example | Meaning |
& a & b | Bitwise AND |
| a | b | Bitwise OR |
^ a ^ b | Bitwise XOR (exclusive OR) |
~ ~a | Bitwise NOT |
<< a << n | Bitwise left shift |
a >> n | Bitwise right shift |
They’re represented by strange-looking symbols rather than words, as you can see. This distinguishes them in Python as being slightly less verbose than you may expect. But, unfortunately, you wouldn’t be able to deduce their meaning by simply glancing at them.
If you’re coming from another programming language, like Java, you’ll notice that Python lacks the unsigned right shift operator (>>>), which is represented by three greater-than marks.
In Python, there is no such thing as a sign bit! This is due to the intrinsic representation of numbers in Python. The sign bit does not have a fixed position in Python since integers can have infinite bits.
The majority of bitwise operators are binary, which means they work with two operands, usually referred to as the left and right operands. Because it only accepts one operand, bitwise NOT (~) is the only unary bitwise operator.
There is a compound operator for each binary bitwise operator that performs an augmented assignment.
Operator | Sample | Equivalent to |
&= | a &= b | a = a & b |
|= | a |= b | a = a | b |
^= | a ^= b | a = a ^ b |
<<= | a <<= n | a = a << n |
>>= | a >>= n | a = a >> n |
These are abbreviations for updating the left operand while it is still in place.
That’s it for the bitwise operator syntax in Python! It’s now time to look at each of the operators’ options in more detail to see where they’re most beneficial and how you may utilize them. Before looking at two types of bitwise operators, the bitwise logical operators, and the bitwise shift operators, we’ll first do a quick review of the binary system.
What’s Binary?
Numbers can be represented in an unlimited number of ways. Various symbols, including Roman numerals and Egyptian hieroglyphs, have been established since ancient times. However, positional notation, which is efficient, flexible, and ideally suited for arithmetic, is used by most modern cultures.
The base of any positional system, which represents the number of digits available, is a noteworthy feature. People naturally prefer the base-ten number system, generally known as the decimal system, since it is more conducive for finger counting.
On the other hand, Computers treat data as a collection of numbers expressed in the binary system, which uses the base-two numeral system. These numbers are made up of only two digits, one and zero.
In the base-ten system, the binary number 100111002 is equivalent to 15610 . Because the decimal system has ten numerals—zero through nine—writing the same number in base ten takes fewer digits than writing it in base two.
Note that you can’t identify a numeral system only by looking at the digits of a number.
The decimal number 10110, for example, has only binary digits. However, its binary counterpart, 1012, which is comparable to 510, signifies a wholly different value.
Although the binary system takes up more storage space than the decimal system, it is significantly easier to implement in hardware. Although you’ll require more construction pieces, they’ll be easier to make and come in fewer varieties. That’s the equivalent of splitting down your code into smaller, more modular, and reusable chunks.
But, more crucially, the binary system is ideal for electronic equipment that converts digits into various voltage levels. You want to retain enough spacing between consecutive voltages because voltage loves to drift up and down due to various noise types. If not, the signal may get distorted.
You may make the system more reliable and noise resistant by using simply two states. Of course, you could also increase the voltage, but that would increase the power consumption, which you don’t want.
How Binary is Used in Computers
Any piece of information must first be broken down into numbers and then converted to the binary system before being duplicated in digital form. Plain text, for example, can be conceived of as a series of characters. You may give each character an arbitrary number or use an existing character encoding like ASCII, ISO-8859-1, or UTF-8.
Strings are represented in Python as arrays of Unicode code points. Call ord() on each of the characters to find out their ordinal values:
[ord(item) for item in "€uro"]
The numbers that arise uniquely identify text characters within the Unicode space, but they are displayed in decimal form.
You’d like to rewrite them in binary numbers as follows:
Character | Decimal Code Point | Binary Code Point |
u | 11710 | 11101012 |
€ | 836410 | 100000101011002 |
o | 11110 | 11011112 |
r | 11410 | 11100102 |
It’s worth noting that bit-length, or the number of binary digits, varies substantially amongst characters. The euro sign (€) takes up fourteen bits, while the remainder of the characters can fit on seven.
Note: In Python, you can check the bit-length of any integer value by using the following syntax:
(42).bit_length()
It would be interpreted as a literal floating-point with a decimal point if no parentheses surrounded the integer.
Bit lengths that vary are a concern. If you put those binary digits next to each other on an optical disc, for example, you’d end up with a long stream of bits with no apparent character boundaries:
100000101011001110101111001011011112
Designating fixed-length bit patterns for all characters is one approach to understanding how to interpret this information. The smallest unit of information in modern computing, an octet or byte, consists of eight bits that may store 256 different values.
To define binary code points in terms of bytes, you can pad them with leading zeros as follows:
Character | Decimal Code Point | Binary Code Point |
u | 11710 | 00000000 011101012 |
€ | 836410 | 00100000 101011002 |
o | 11110 | 00000000 011011112 |
r | 11410 | 00000000 011100102 |
Each character now consists of two bytes or 16 bits. Your original text nearly quadrupled in size, but it’s at least encoded correctly.
To find clear bit patterns for each character in a document, you can use Huffman coding or a more appropriate character encoding. UTF-8, for example, prioritizes Latin characters over symbols that are less likely to appear in an English text to conserve space:
len("€uro".encode("utf-8"))
The entire text needs six bytes to encode using the UTF-8 standard. Because UTF-8 is a superset of ASCII, the letters u, r, and o each take up one byte, whereas the euro symbol takes up three bytes:
for item in "€uro": print(item, len(item.encode("utf-8")))
Other sorts of data can be digitized in the same way that text can. Pixels make up raster images, and each pixel has filters that express color intensities as integers. Numbers relating to air pressure at a specific sample interval can be found in sound waveforms. Geometric shapes specified by their vertices and so on are used to create three-dimensional models.
Everything is a number at the end of the day.
Bitwise Logical Operators
Bitwise operators can be used to perform Boolean logic on individual bits. On a bit level, this is similar to utilizing logical operators like and, or, and not. Beyond that, there are parallels between bitwise and logical operators.
Bitwise operators can be used to evaluate Boolean expressions instead of logical operators. However, this is generally discouraged.
Only use bitwise operators for managing bits unless you have a good reason and know what you’re doing. Otherwise, it’s far too easy to make a mistake. In most circumstances, you’ll want to use integers as bitwise operators’ parameters.
Bitwise AND
The logical conjunction is performed by the bitwise AND operator (&) on the appropriate bits of its operands. It only yields a one when both bits are switched on for each pair of bits occupying the exact location in the two numbers:
The resulting bit pattern is the operator’s parameters intersected. When both operands are ones, the result is it has two bits turned on. A zero bit is present in at least one of the inputs in every other position.
This is the arithmetic equivalent of a product of two-bit values. When considering the numbers a and b, the bitwise AND can be calculated by multiplying their bits at each index I
When a one is multiplied by one, the result is one, but when a one is multiplied by zero, the result is always zero. You can also take the smaller of the two bits in each pair. When the bit lengths of the operands are mismatched, the shorter one is automatically padded to the left with zeros.
Bitwise OR
Logical disjunction is achieved using the bitwise OR operator (|). If at least one of the appropriate pairs of bits is switched on, it yields a one:
The bit pattern that results is a union of the operator’s arguments. If one is present in one of the operands, it has five bits turned on. Only a two-zero combination results in a zero in the final output.
It uses a combination of a sum and a product of bit values as its arithmetic. To find the bitwise OR of numbers a and b, multiply their bits at each index I.
Bitwise XOR
Unlike the bitwise AND, OR, and NOT operators, the bitwise XOR operator does not have a logical counterpart in Python. You may, however, emulate it by adding to the current operators:
def xorFunction(x, y): return (x and not y) or (not x and y)
It assesses two mutually exclusive conditions and informs you whether or not one of them has been met. An employee cannot be both a senior and a junior at the same time, for example. An employee cannot, on the other hand, be neither a junior nor a senior. It is necessary to make a decision.
Because it conducts exclusive disjunction on the bit pairs, the word XOR stands for “exclusive or.” To put it another way, every bit pair must have negative bit values to form a one.
Bitwise NOT
The bitwise NOT operator , the last of the bitwise logical operators, takes only one argument and is the sole unary bitwise operator. It flips all of a number’s bits to execute logical negation.
The bits that are inverted are a complement to one another, turning ones into zeros and zeros into ones. Thus, it can be stated as the subtraction of individual bit values from one in arithmetic terms.
While the bitwise NOT operator appears to be the most straightforward of them all, utilizing it in Python requires significant caution.
Python does not handle unsigned integers natively, even though there are ways to imitate them. That is, whether you provide one or not, all numbers have an implicit sign connected to them. When you conduct a bitwise NOT of any number, this appears:
~156
You obtain a negative number instead of the expected 9910. Once you understand the multiple binary number formats, the rationale for this will become evident. For the time being, the quick-fix approach is to use the bitwise AND operator:
~156 & 255
Bitwise Shift Operators
Bitwise shift operators are a different type of bit manipulation tool. They allow you to shift the bits around, which will come in handy later when designing bitmasks. They were once commonly employed to boost the speed of specific mathematical procedures.
Left Shift
The bitwise left shift operator (<<) shifts the bits of its first operand by the number of places provided in its second operand to the left. It also ensures that enough zero bits are inserted to fill the gap on the right edge of the new bit pattern:
Changing the value of a single bit to the left by one place doubles it. For example, following the shift, the bit will represent a four instead of a two. The outcome will be quadrupled if you move it two positions to the left. When you sum up all the bits in a number, you’ll find that it doubles with each shift in place:
Shifting bits to the left, in general, is equivalent to multiplying a number by a power of two, with an exponent equal to the number of locations shifted:
Because bit shifting is a single command, it can be done quickly and costs less to calculate than exponent or product, and it was once a popular optimization approach. In addition, compilers and interpreters, like Python’s, are now capable of optimizing your code in the background.
On paper, a left shift causes the bit pattern to lengthen by as many places as you shift it. Because of the way Python handles integers, this is also true in general. However, in most circumstances, you’ll want to limit a bit pattern’s length to a multiple of eight, which is the typical byte length.
If you’re working with a single byte, for example, moving it to the left will delete any bits that extend beyond the left boundary:
It’s like gazing through a fixed-length window at an infinite stream of bits. In Python, there are a couple of tricks that allow you to do this. You can use the bitwise AND operator to apply a bitmask, for example:
43 << 3 (43 << 3) & 255
Three places to the left of 3910 yields a number greater than the maximum value stored on a single byte. This is because it requires nine bits, whereas a byte only requires eight. So you can use a bitmask with the appropriate value to remove that one extra bit on the left.
If you want to keep more or fewer bits, you’ll have to adjust the mask value accordingly.
Right Shift
The bitwise right shift operator (>>) is similar to the left shift operator in that it pushes bits to the right by the specified number of places instead of moving them to the left. The rightmost portions are always left out:
You halve the underlying value every time you move one place to the right. Moving the same bit two positions to the right yields half the original value, and so on. When you sum up all the separate bits, you’ll notice that the number they represent follows the same pattern:
A fraction is formed by halving an odd number, such as 15710. The right shift operator automatically floors the output to get rid of it. It’s almost the same as dividing a floor by a factor of two:
The number of spaces shifted to the right corresponds to the exponent once more. To execute a floor division in Python, you can use a specific operator:
9 >> 1 # Bitwise right shift 9 // 2 # Floor division (integer division) 9 / 2 # Floating-point division
Even for negative values, the bitwise right shift operator and the floor division operator work similarly. On the other hand, the floor division allows you to choose any divisor, not simply a power of two. Using the bitwise right shift to improve the efficiency of some arithmetic divisions was a widespread practice.
Arithmetic vs. Logical
The bitwise shift operators are further divided into arithmetic and logical shift operators. While Python only allows you to conduct arithmetic shifts, it’s helpful to understand how other programming languages provide bitwise shift operators to avoid surprises.
Representations of Binary Numbers
When utilizing the bitwise negation (~) and the right shift operator (>>), you’ve seen firsthand how Python lacks unsigned data types. You’ve seen indications of Python’s unconventional approach to integer storage, which makes dealing with negative values difficult. To properly use bitwise operators, you must first understand the various binary representations of integers.
Unsigned Integers
You can choose whether to use the signed or unsigned version of a numeric type in computer languages like C. When you know you’ll never have to deal with negative values, unsigned data formats are preferable. You effectively increase the potential value range by assigning that extra bit, which would otherwise be used as a sign bit.
It also adds a layer of safety by raising the maximum limit before an overflow occurs. Overflows, on the other hand, only happen with fixed bit lengths. Therefore they’re irrelevant in Python, which doesn’t have them.
The ctypes module, as was previously stated, is the quickest method to get a taste of Python’s unsigned numeric types:
signed integers
A number’s sign can only be in one of two states. If you ignore 0 for a time, it can be positive or negative, which corresponds to the binary system nicely. However, there are a few other ways to express signed integers in binary, each with its own set of advantages and disadvantages.
The sign-magnitude, which naturally builds on top of unsigned integers, is probably the most straightforward. When a binary sequence is decoded as sign-magnitude, the most significant bit acts as a sign bit, while the remaining bits usually function.
Numbers with Floating Points
The IEEE 754 standard specifies the sign, exponent, and mantissa bits in a binary representation for real numbers.
The float data type in Python is the same as the double-precision type. However, it’s worth noting that some applications need more or fewer bits. The OpenEXR image format, for example, uses half-precision to represent pixels with a wide dynamic range of colors while maintaining a small file size.
Numbers with a fixed point
While floating-point numbers are ideal for technical computations, their lack of precision makes them unsuitable for monetary calculations. Some integers, for example, have only an infinite representation in binary despite having a finite representation in decimal notation. This frequently leads to a rounding error, which can add up over time:
0.1 + 0.2
In these situations, you should use Python’s decimal module, which implements fixed-point arithmetic and allows you to specify where the decimal point should be placed on a particular bit length. You can specify the number of digits you want to keep by saying something like:
from decimal import Decimal, localcontext with localcontext() as context: context.prec = 6 # Number of digits print(Decimal("235.456987") * 1)
It does, however, include all digits, not just fractions.
Interned Integers
Because values in that range are frequently used in CPython, very small integers between -510 and 25610 are interned in a global cache to improve efficiency. In practice, Python will always provide the same instance every time you refer to one of those values, which are singletons produced at interpreter startup:
x = 256 y = 256 x is y print(id(x), id(y), sep="\n")
Integers with a Fixed Precision
The integers you’re most likely to discover in Python use the C signed long data type. On a fixed number of bits, they employ the traditional two’s complement binary encoding. Your hardware platform, operating system, and Python language version will all influence the actual bit length.
Because most modern computers have 64-bit architecture, this translates to decimal integers between -263 and 263 – 1. In Python, you may check the maximum value of a fixed-precision integer by typing:
import sys sys.maxsize
Integers with Arbitrary Precision
Do you recall the popular K-pop song “Gangnam Style” from 2012, becoming a worldwide hit? The video was the first to reach a billion views on YouTube. The view counter soon overflowed due to the large number of people who had watched the video.
YouTube had no alternative but to switch from 32-bit signed integers to 64-bit signed integers.
That may seem like plenty of space for a view counter, but far more significant numbers are usually in real life, particularly in the scientific community. Python, on the other hand, can quickly deal with them:
from math import factorial factorial(42)
Integer to Binary Conversion
You can print a prepared string literal in Python that allows you to choose the number of leading zeros to display to reveal the bits that make up an integer value:
print(f"{42:b}") # Print 42 in binary print(f"{42:042b}") # Print 42 in binary on 42 zero-padded digits
Binary to Integer Conversion
Once you’ve got your bit string ready, you can use a binary literal to acquire its decimal representation.
In the case of dynamically produced bit strings, calling int() with two arguments is preferable:
int("101010", 4) int("cafe", 20)
Emulating the Sign Bit
Bin() on a negative integer simply prepends the minus sign to the bit string obtained from the positive value:
print(bin(-38), bin(38), sep="\n ")
Order of Bytes
There is no doubt about the order of bits in a single byte. Regardless of how they’re physically laid up in memory, the least significant bit will always be at index zero, and the most significant bit will always be at index seven. The bitwise shift operators require this consistency.
Data with more than one byte can be read from left to right, as in an English text, or from right to left, as in an Arabic text. However, there is no agreement on the byte order in multibyte data chunks. This is because computers see bytes in a binary stream in the same way humans see words in a sentence.
It makes no difference whether direction computers read bytes from as long as they follow the same set of rules everywhere. Unfortunately, different computer architectures take different techniques to data transfer, making it difficult to transmit data across them.
Bitmasks
A bitmask functions similarly to a graffiti stencil in that it prevents the paint from being sprayed on specific sections of a surface. It allows you to isolate the bits to apply a function to them selectively. The bitwise logical operators and the bitwise shift operators are two types of bitwise logical operators used in bitmasking.
Bitmasks can be seen in a variety of settings. An IP address, for example, the subnet mask, is a bitmask that aids in the extraction of the network address. A bitmask can be used to access pixel filters, which correspond to the red, green, and blue colors in the RGB model. A bitmask can also define Boolean flags that can then be packed into a bit field.
Bitmasks are used for a few different types of operations. We’ll take a look at a few of them right now.
Getting a Bit
You can apply the bitwise AND against a bitmask made of only one bit at the desired index to read the value of a particular bit at a certain position:
def getting_a_bit(value, bit_index): return value & (1 << bit_index) getting_a_bit(0b10000000, bit_index=5) getting_a_bit(0b10100000, bit_index=5)
Setting a Bit
Setting a bit is the same as receiving one. You use the same bitmask as previously, but this time instead of bitwise AND, you use the bitwise OR operator:
def bit_setting(value, bit_index): return value | (1 << bit_index) bit_setting(0b10000000, bit_index=6) bin(185)
The mask keeps all of the original bits while enforcing a binary one at the given index. Thus, the value of that bit would not have changed if it had previously been set.
Unsetting a Bit
To clarify things up a little, you’ll want to replicate all binary digits while enforcing zero at a single index. Using the same bitmask, you can obtain this effect, but this time inverted:
def bit_unsetting(value, bit_index): return value & ~(1 << bit_index) bit_unsetting(0b11111111, bit_index=4) bin(635)
In Python, using the bitwise NOT on a positive number always results in a negative value. While this is usually desired, it isn’t an issue because the bitwise AND operator is applied right away. It causes the mask to be converted to two’s complement representation, giving you the desired outcome.
Toggling a Bit
It’s sometimes useful to be able to flick a bit on and off now and then. That’s where the bitwise XOR operator comes in handy, as it can flip your bit like this:
def bit_toggling(value, bit_index): return value ^ (1 << bit_index) a = 0b10100000 for _ in range(5): a = bit_toggling(a, bit_index=7) print(bin(a))
Overloading of the Bitwise Operator
Integer numbers are the primary domain of bitwise operators. That’s where they’re most useful. They’ve also been encountered in a Boolean setting, where they’ve taken the place of logical operators. Some of Python’s operators have alternate implementations, and you can overload them with new data types.
Even though the proposal to overload Python’s logical operators was rejected, any bitwise operators can be given new meaning. As a result, many popular libraries and the public library make use of it.
Data Types -Builtin
The following built-in Python data types have bitwise operators:
set, frozenset(), dict, int, and bool since Python 3.9
Bitwise operators may perform set algebra operations like union, intersection, and symmetric difference and merge and update dictionaries.
cars = {"bmw", "mercedes", "harrier"} buses = {"double-decker", "minibus"} cars | buses cars & buses cars ^ buses cars - buses # Not a bitwise operator! cars |= buses # Python 3.9+, same as cars.update(buses)
Conclusion
Mastering Python bitwise operators provide you complete control over binary data manipulation in your projects. You now understand their syntax and flavors, as well as the data types that they support. You can also tailor their behavior to meet your specific requirements.
Your key takeaways in this article include:
- Bitwise operators in Python can be used to manipulate individual bits
- Binary data can be read and written in a platform-independent manner
- Bitmasks are used to compress data into a single byte
- In custom data types, overload Python bitwise operations
- Secret messages can be hidden in digital photographs
We also explored how computers express many types of digital data using the binary system. You learned about various typical ways to interpret bits, as well as how to get around Python’s absence of unsigned data types and its unique way of storing integer numbers in memory.
With this knowledge, you’ll be able to utilize binary data in your programming fully.