Python: Space Complexity
Where there’s time, there’s also space!
Space complexity is the other side of the coin when determining the most efficient algorithm to use. Knowing how to measure the memory requirements of an algorithm can be as important as figuring out its runtime, and the process is similar to measuring time complexity.
Table of Contents:
- What is Space Complexity?
- How to Measure Space Complexity
- Why is it Important?
- Conclusion
What is Space Complexity?
There are two types of memory: RAM (Random Access Memory) and storage drive memory (hard drives, flash drives, solid-state drives, hybrid drives, etc.). RAM is fast memory that stores values temporarily and isn’t persistent: if your PC turns off, anything stored in RAM is lost. Storage drive memory is much slower than RAM, but it can handle much more in terms of capacity. Storage drive memory is also persistent: if your PC shuts down, your data is still there, and the only way to get rid of it is to explicitly delete it.
How to Measure Space Complexity:
When we want to know how much memory is consumed by an algorithm, we are referencing how much RAM is needed for that algorithm to run. If you’re trying to process 1TB of data and your PC has 32GB of RAM, then you’ll have to process the data in chunks. Your processing power is limited to the size of your RAM, in this case 32GB. So if you’re processing at most 32GB of data at any given time, then your memory consumption is 32GB, not 1TB.
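To make the chunking idea concrete, here’s a minimal sketch (the chunk size, file name, and one-number-per-line format are assumptions for illustration, not part of the article). It sums the numbers in a file that may be far larger than RAM while only ever holding one chunk in memory:

CHUNK_SIZE = 64 * 1024 * 1024  # hypothetical: 64MB of text per read

def sum_file_in_chunks(path):
    # Peak memory stays near CHUNK_SIZE no matter how big the file is,
    # because only one chunk (plus a partial trailing line) is in RAM.
    total = 0
    leftover = ""
    with open(path, "r") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            lines = (leftover + chunk).split("\n")
            leftover = lines.pop()  # last entry may be an incomplete line
            total += sum(int(line) for line in lines if line.strip())
    if leftover.strip():
        total += int(leftover)
    return total

# e.g. sum_file_in_chunks("big_numbers.txt") can total a 1TB file on a 32GB machine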
Number values, characters, and variable references are basic types of data that have a space complexity of O(1); string values are usually treated the same way, though strictly speaking a string’s space grows with its length. When you want to model the space complexity of your algorithm, it’s important to note that we don’t count the space taken by the input (arguments). Instead, we estimate how much memory the algorithm allocates for these basic data types as the size of the input grows.
For Example:
numbers_list = [1, 5, 7, 7, 5, 1, 4, 3, 7, 8]

def get_total(numbers_list):
    total = 0
    for num in numbers_list:
        total += num
    return total

print(get_total(numbers_list))  # 48
— Here we have our list of numbers saved in our numbers_list variable.
— We have two variables: total, and num (from our for loop).
— Our total and num variables have a constant space complexity of O(1) since they are just numerical values, no matter how long numbers_list gets (a contrasting O(n) variant is sketched below).
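For contrast, here’s a hedged sketch (not from the original example) of a version that allocates memory proportional to the input. Because it builds a list with one running total per element, its space complexity is O(n) instead of O(1):

def get_running_totals(numbers_list):
    # running_totals grows by one entry per input element: O(n) space.
    running_totals = []
    total = 0
    for num in numbers_list:
        total += num
        running_totals.append(total)
    return running_totals

print(get_running_totals([1, 5, 7, 7]))  # [1, 6, 13, 20]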
— At a high level, let’s take a look at a simple function that generates a list of lists.
def generate_lists_of_lists(n):
    table_list = []
    for num in range(n):
        row = []
        for i in range(n):
            row.append(i)
        table_list.append(row)
    return table_list

print(generate_lists_of_lists(10))
Output:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]
— Here we’re creating a list of lists, where each inner list holds the numbers 0 through n-1.
— We can see that our table_list variable contains n^2 values, because we have n rows and each of those rows holds n values. So the space complexity of this algorithm is O(n^2). (If you only need one row at a time, the generator sketch below can bring that down.)
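As a hedged aside (a rewrite for illustration, not the article’s original function): if the rows are consumed one at a time, a generator keeps only the current row in memory, cutting peak space from O(n^2) to O(n).

def generate_rows(n):
    # Yields rows lazily instead of materializing all n of them,
    # so peak memory holds a single n-element row: O(n) space.
    for _ in range(n):
        yield list(range(n))

for row in generate_rows(3):
    print(row)  # each row is built, used, then discarded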
Class Object Space Complexity:
— For more complicated data structures, like class objects, it looks more like this.
class Monster:
    def __init__(self, name, power, weapon):
        self.name = name      # 1
        self.power = power    # 1
        self.weapon = weapon  # 1
— Given that these Monster class attributes each hold a single value, they will use O(1) memory.
— The total space complexity is 1 + 1 + 1 = 3, or O(3), which simplifies to O(1).
— To simplify, replace the constants (in this case, 3) with 1 and drop the more slowly growing terms, keeping only the fastest-growing one. (A sketch below shows how a growing collection of these objects becomes O(n).)
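To tie this back to input size, here’s a small hedged sketch (the helper function and sample values are hypothetical): each Monster is O(1), but a list of n of them takes O(n) space.

def spawn_monsters(n):
    # Each Monster uses O(1) space, but holding n of them in a list
    # makes the total space complexity O(n).
    return [Monster(f"monster_{i}", power=i, weapon="claws") for i in range(n)]

horde = spawn_monsters(100)
print(len(horde))  # 100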
Why is it Important?
When working on algorithms, there may be cases where we have to trade memory for speed or vice versa, though there may also be opportunities to balance both space and time efficiency. When looking at an algorithm and trying to figure out how to adjust it for the best performance, you need to analyze where the bottlenecks are and where you can make improvements.
Knowing how much memory is needed to run an algorithm over a dataset at scale is very useful. It helps us figure out what kind of PC hardware our system needs, and how much storage space our application requires, which depends on how much data we have to process. And if the data you need to process is larger than your RAM’s capacity, you’d either need to buy more RAM, figure out how to process the data in chunks, or parallel-process the data using something like PySpark or Dask. One thing that differentiates space complexity from time complexity is that for space complexity you measure the memory used by Python objects and data structures (lists, tuples, dictionaries, variables, etc.), whereas for time complexity you measure the runtime of the algorithm.
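As one concrete sketch of the chunking idea (the file name and column name are made up for illustration), pandas can stream a CSV through memory with read_csv’s chunksize parameter, reducing each piece before discarding it:

import pandas as pd

total = 0
# chunksize keeps only 1,000,000 rows in RAM at a time instead of
# loading the entire (possibly larger-than-RAM) CSV at once.
for chunk in pd.read_csv("huge_dataset.csv", chunksize=1_000_000):
    total += chunk["amount"].sum()  # reduce the chunk, then let it go
print(total)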
Conclusion:
Learning how to conduct a space complexity analysis of your algorithm is another great way to figure out the right specs for your application. If your system memory is smaller than the size of your data file, you might need to reconfigure how your data gets processed in memory, and that may or may not come at the cost of speed. Sometimes how you chunk your data as it gets processed can also improve speed, or you might need to consider parallel processing using tools like Spark or Dask. Hope you got a good dose of learning about space complexity. Please leave a clap, follow, and share, until we code again!