Slicing: primer | Jakub Tesárek

    Slicing and indexing are two basic operations used to retrieve data from a list in Python. In the first part of this 3-part series we'll look at what slicing (and indexing) is and go through practical examples that you can use in your code. At the end we'll look at the very powerful slice assignment.

    Indexing

    Indexing allows you to retrieve one element by its index.

    l = ['A', 'B', 'C', 'D']
    result = l[2] # result == 'C'

    If you know which element you want relative to the end of the list, you can access it by a negative index.

    l = ['A', 'B', 'C', 'D']
    result = l[-1] # result == 'D'

    If you try to access an index that doesn't exist in the list, Python will raise an exception.

    l = ['A', 'B', 'C', 'D']
    result = l[100] # IndexError: list index out of range
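
    To make the two rules above concrete, here is a small sketch. It first checks that a negative index -k points at the same element as len(l) - k, and then wraps the lookup in a helper that returns a default instead of raising. The helper name get_or_default is hypothetical and used only for illustration.

```python
l = ['A', 'B', 'C', 'D']

# A negative index -k refers to the same element as len(l) - k.
assert l[-1] == l[len(l) - 1] == 'D'
assert l[-4] == l[0] == 'A'


def get_or_default(items, index, default=None):
    """Hypothetical helper: return items[index], or default if out of range."""
    try:
        return items[index]
    except IndexError:
        return default


print(get_or_default(l, 2))         # C
print(get_or_default(l, 100, '?'))  # ?
print(get_or_default(l, -5, '?'))   # ?
```

    Note that negative indexes can be out of range too: a list of 4 elements accepts -1 to -4, but not -5.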

    Slicing

    Slicing is similar to indexing, but instead of one element it allows you to retrieve multiple elements. It constructs a new list and fills it with elements from the original list based on 3 criteria. The basic syntax looks like this: my_list[start:end:step].

    Start is the index of the first element we want. You can omit this value and Python will simply start at the first element.

    All elements up to the end value will be included, except the element at that index itself. You can omit this value and Python will automatically assume you mean "until the end of the list".

    Step tells Python how many elements it should move in each step. By default, this value is 1, meaning it will include every element between start and end. A value of 2 means every second element, and so on.

    l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    result = l[1:3:1] # result = [1, 2]
    result = l[2:8:3] # result = [2, 5]
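
    As a side note, the same three values can be stored in Python's built-in slice object, which can make a longer expression easier to read. The sketch below also checks that a slice is a new list, independent of the original. The variable names are chosen only for illustration.

```python
l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# The built-in slice object holds the same start, end and step values.
every_third = slice(2, 8, 3)
assert l[every_third] == l[2:8:3] == [2, 5]

# Omitted values are stored as None.
first_three = slice(None, 3)
assert l[first_three] == l[:3] == [0, 1, 2]

# A slice is a new list; changing it leaves the original untouched.
part = l[1:3]
part.append('X')
print(part)  # [1, 2, 'X']
print(l)     # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```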

    When you use a start or end that's outside the range of the list, or your indexes mismatch (for example, you start on a later index than end: l[100:1:1]), Python will not raise an exception. It will return whatever matches. If, for example, you use an end that's outside the list, it will return all elements up to the actual end of the list. If your arguments don't match any elements, you'll get an empty list. The only invalid value is a step of 0, which raises ValueError.

    l = [0, 1, 2, 3]
    result = l[100:102] # result == []
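
    A few more cases of this forgiving behaviour, as a quick sketch. The slice.indices method is part of the built-in slice object and returns the concrete (start, stop, step) values Python ends up using for a list of the given length, which is a handy way to see the clamping.

```python
l = [0, 1, 2, 3]

print(l[2:100])   # [2, 3]  end is clamped to the length of the list
print(l[-100:2])  # [0, 1]  start is clamped to the beginning
print(l[3:1])     # []      start is after end, so nothing matches

# slice.indices(length) shows the values Python actually uses.
print(slice(2, 100).indices(len(l)))    # (2, 4, 1)
print(slice(100, 102).indices(len(l)))  # (4, 4, 1)

# A step of 0 is the one case that does raise an exception.
try:
    l[::0]
except ValueError as error:
    print(error)  # slice step cannot be zero
```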

    Let's look at some more examples. For the rest of this section we'll use a list defined like this l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] and to make it easier to understand what exactly is going on, we'll visualize it like this:

    0 1 2 3 4 5 6 7 8 9

    First 3 values

    You can retrieve the first 3 values like this: l[:3]. We didn't use the start argument, as it's 0 by default and therefore unnecessary.

    [0 1 2] 3 4 5 6 7 8 9

    Last 3 values

    Similarly to the previous example, you can get the last three values using l[-3:]. We start 3 elements from the end and continue until the end of the list.

    0 1 2 3 4 5 6 [7 8 9]

    If we want, we can retrieve the same values in reversed order: l[:-4:-1].

    [9 8 7] 6 5 4 3 2 1 0

    Cut off ends

    Sometimes you might want the same list but without the first and last value (or several of them): l[1:-1].

    0 [1 2 3 4 5 6 7 8] 9

    Every other value

    To select every other value, we can omit the first two arguments and simply write l[::2]. This is equivalent to l[0:len(l):2], just shorter.

    [0] 1 [2] 3 [4] 5 [6] 7 [8] 9

    Reversing a list

    Python has a very nice "trick" you can use to reverse a list: l[::-1]. The omitted start and end arguments make sure all elements are used, and the step of -1 reverses the direction in which the list is processed.

    In full, without omitting any arguments, it would look like this: l[len(l):-(len(l)+1):-1].

    9 8 7 6 5 4 3 2 1 0

    Because strings in Python also support slicing, this trick is used quite often. Here's a simple function that detects whether a given string is a palindrome:

    def is_palindrome(s: str) -> bool:
    	return s == s[::-1]
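
    For comparison, here is how the slicing trick relates to the two other standard ways of reversing a list, reversed() and list.reverse(). This is only a sketch to show which of them create a new list and which change the original.

```python
l = [0, 1, 2, 3]

copy_reversed = l[::-1]   # new list, l is unchanged
iterator = reversed(l)    # lazy iterator over l, no copy is made
assert copy_reversed == list(iterator) == [3, 2, 1, 0]
assert l == [0, 1, 2, 3]

l.reverse()               # reverses in place and returns None
assert l == [3, 2, 1, 0]
```

    The slice is the only one of the three that also works on strings and tuples and gives back the same type, which is why it fits the palindrome check so well.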

    Copying a list

    Lists are mutable structures; if you add or remove elements in one part of the code, the change is visible through every reference pointing to the same list. That's why you often need to copy a list to make sure you can safely work with it (one of the principles of defensive programming).

    Because we know that slicing produces a new list, the easiest way to make a copy is simply l[:], which is equivalent to l[0:len(l):1]. We omit all the arguments and include just a single : to signal our intent to use slicing.

    0 1 2 3 4 5 6 7 8 9
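
    The difference between a second reference and a copy can be sketched like this. The example also shows one thing to keep in mind: l[:] is a shallow copy, so nested lists are still shared. When that matters, copy.deepcopy from the standard library is one option. The variable names are illustrative only.

```python
import copy

l = [0, 1, 2]
alias = l          # a second name for the same list
snapshot = l[:]    # a new list holding the same elements

l.append(3)
print(alias)     # [0, 1, 2, 3]
print(snapshot)  # [0, 1, 2]

# The copy is shallow: inner lists are shared, not duplicated.
nested = [[1, 2], [3, 4]]
shallow = nested[:]
deep = copy.deepcopy(nested)
nested[0].append('X')
print(shallow)  # [[1, 2, 'X'], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```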

    Surrounding a value

    Another trick that you'll use mostly with strings is finding surrounding elements. Imagine you're writing a function that searches for a string in another string. When presenting the data to the user, it's a good idea to include a few characters before and after the found string.

    from typing import Optional

    def search_text(what: str, where: str, padding: int) -> Optional[str]:
        """Find the first occurrence of a substring.

        If `what` is found in `where`, return it surrounded by `padding`
        characters on each side; otherwise return None.
        """
        index = where.find(what)  # str.find returns -1 when there is no match
        if index == -1:
            return None
        start = max(index - padding, 0)  # Prevent negative values
        end = index + padding + len(what)
        return where[start:end]

    result = search_text('Wo', 'Hello World!', 3) # result == 'lo World'
    match:  Hello [Wo]rld!
    result: Hel[lo World]!
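
    Why does the start of the slice need max(..., 0) while the end gets no such guard? A short sketch with a match at the very beginning of the text shows the difference. The values are picked just for this illustration.

```python
where = 'Hello World!'
what = 'He'
padding = 3

index = where.find(what)           # 0, the match is at the very start
end = index + len(what) + padding  # 5

# Without clamping, start is -3, which Python reads as "3 from the end".
unclamped = where[index - padding:end]
print(repr(unclamped))  # ''

clamped = where[max(index - padding, 0):end]
print(clamped)  # Hello

# The end needs no guard: slicing simply stops at the end of the string.
print(where[9:100])  # ld!
```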

    Batch processing

    Batch processing is a very common software engineering problem. You have a list of items you need to process (e.g. records that need to be stored in a DB) and you need to split the items into smaller batches. One way to do it is to use slicing and a list comprehension. Let's look at the solution first and then I'll explain the details.

    from typing import List, Any
    
    def split_to_batches(l: List, batch_size: int) -> List[List[Any]]:
    	""" Split list to smaller lists """
    	return [l[i:i + batch_size] for i in range(0, len(l), batch_size)]
    
    l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    batches = split_to_batches(l, 3) # batches == [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
    [0 1 2] 3 4 5 6 7 8 9
    0 1 2 [3 4 5] 6 7 8 9
    0 1 2 3 4 5 [6 7 8] 9
    0 1 2 3 4 5 6 7 8 [9]

    How does it work? At this point you should understand that l[:batch_size] returns batch_size items from the beginning. Similarly, l[i:i + batch_size] returns batch_size items starting from i. For the last batch the end may fall beyond the list, in which case slicing simply returns the items that are left.

    We just need to generate i so that it starts at 0 and increases by batch_size in each step. That's what range(0, len(l), batch_size) does. Range generates an integer sequence. The arguments are very similar to slicing: the first argument is the starting number, the second is the upper bound (not included) and the last one is the step size. We use this here to create the sequence 0, 3, 6, 9, which stops before the length of the original list.

    Then we just have to put it all together using a list comprehension. I recommend reading the documentation regarding this feature; it's a bit out of scope for this article.
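
    If the list comprehension feels too dense, the same idea can be written as a generator with an explicit loop. The following is a sketch of an alternative, not a replacement for the function above; the name iter_batches and the check on batch_size are my own additions for illustration.

```python
from typing import Any, Iterator, List


def iter_batches(items: List[Any], batch_size: int) -> Iterator[List[Any]]:
    """Hypothetical lazy variant: yield one batch at a time."""
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')
    # range produces the start index of every batch: 0, 3, 6, 9, ...
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(range(0, len(l), 3)))  # [0, 3, 6, 9]
for batch in iter_batches(l, 3):
    print(batch)
```

    Because batches are produced one at a time, only one of them has to exist in memory at once, which can help when each batch is written to a database and then discarded.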

    Slice assignment

    Slice assignment is a tool that makes it possible to replace parts of one list with another list. The syntax looks like this:

    first_list[start:end:step] = second_list

    The part to the left of = specifies which elements will be removed; the part on the right is an iterable providing the new elements.

    l1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    l2 = ['A', 'B', 'C']
    l1[:3] = l2 # l1 = ['A', 'B', 'C', 3, 4, 5, 6, 7, 8, 9]

    l1:

    0 1 2 3 4 5 6 7 8 9

    l2:

    A B C

    resulting l1:

    A B C 3 4 5 6 7 8 9

    Slice assignment can change the size of the original list. If you assign more elements than you specify for replacement, the resulting list will be longer, and vice versa.

    l1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    l1[-5:] = l2 # l1 = [0, 1, 2, 3, 4, 'A', 'B', 'C']
    0 1 2 3 4 A B C
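
    Taken to the extreme, the same mechanism can insert elements without removing any, or remove elements without inserting any. A small sketch, using a fresh list for clarity:

```python
l = [0, 1, 2, 3, 4]

l[2:2] = ['A', 'B']  # empty slice: nothing removed, two elements inserted
print(l)  # [0, 1, 'A', 'B', 2, 3, 4]

l[1:4] = []          # three elements removed, nothing inserted
print(l)  # [0, 2, 3, 4]

l[len(l):] = 'xy'    # any iterable works; a string adds its characters
print(l)  # [0, 2, 3, 4, 'x', 'y']
```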

    We can even use more advanced slicing constructs, for example:

    l1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    l1[:4:-2] = l2 # l1 = [0, 1, 2, 3, 4, 'C', 6, 'B', 8, 'A']
    0 1 2 3 4 C 6 B 8 A

    There's one caveat, though. If you use the third argument, step, you cannot change the size of the resulting list; the number of new elements has to match the number of replaced elements exactly, otherwise you'll get a ValueError.

    l1[::2] = l2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: attempt to assign sequence of size 3 to extended slice of size 5
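
    One way to stay on the safe side is to size the right-hand side from the slice itself. The sketch below does that, and also shows that deleting with a step is allowed, since there is nothing to match. The variable name target is illustrative only.

```python
l1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

target = l1[::2]               # the 5 elements the slice selects
l1[::2] = ['A'] * len(target)  # same size, so the assignment succeeds
print(l1)  # ['A', 1, 'A', 3, 'A', 5, 'A', 7, 'A', 9]

# Deleting with a step works, because no sizes have to match.
del l1[::2]
print(l1)  # [1, 3, 5, 7, 9]
```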

    Thank you for reading this introduction to slicing. In the next part we'll look at a more advanced topic: slicing multi-dimensional arrays. Make sure you don't miss it and subscribe.