2
votes

I was doing Singly-Linked List implementation and I remember Linus Torvalds talking about it here.

In a singly-linked list in order to remove a node we should have access to the previous node and then change the node it is currently pointing to.

Like this enter image description here

So any way we should have access to the previous node.

But Linus Torvalds removed the special case by using the idea of address in C. So that head also has the 'previous thing' which is the address of head which points to head. So he has used C's feature of pointer and address to remove special case.

The normal code with special case enter image description here

The code with special case becoming normal case enter image description here

I think this kind of special case removal in singly linked list cannot be done in python, because we don't have the concept of pointers (and hence I will not be able to go one step before head)? Am I right ?

4
Related, the case Linus tries to amplify walks a pointer-to-pointer through the list, addressing not just the previous node, but rather the pointer that points to the prospect node. No external pointer-to-node is used beyond head, Rather, a pointer-to-pointer is used instead. And fwiw, the samples posted will both "leak" the removed-node if nothing else points to it in the given example. They will be unreachable once the link is severed.WhozCraig

4 Answers

2
votes

Sure you can do this in Python. What he's saying is that you have some data structure that represents the list itself and points to the head of the list, and you manipulate that just as you would the pointer in a list item when you're dealing with the first list item.

Now Python is not C so the implementation would be different, but the principle applies. The list itself is not the same object as its first item, and list items should not have the same methods as the list as a whole, so it makes sense to use separate kinds of objects for them.

Both of them can, however, use an attribute of the same name (e.g. next) to point to the next item. So when you iterate through the list, and you are at the first item, the "previous" item is the list itself, and you are manipulating its next attribute if you need to remove the first item.

In the real world, of course, you would never write your own Python linked list class except as an exercise. The built-in list is more efficient.

1
votes

You cannot use Linus's specific trick in Python, because, as you well know, Python does not have pointers (as such) or an address-of operator. You can still, however, eliminate a special case for the list head by giving the list a dummy head node. You can do that as an inherent part of the design of your list, or you can do it on the fly just by creating an extra node and making it refer to the first data-bearing node as its next node. Either way, all the nodes you might want to delete are then interior nodes, not special cases.

1
votes

My first thought upon reading your question was: why would you want to build a singly linked list in python? Python offers a wealth of collection types and you can use these without having to worry about whether they are implemented as singly linked list, as double linked list or as some non-recursive data structure (which are usually easier to handle).

But the answer to your question is: Python allows of course to build a singly linked list. For example, the following code does just that:

class Node:
    def __init__(self, x, next):
        self.x = x
        self.next = next

    def __str__(self):
        return "<{}, {!s}>".format(self.x, self.next)

n = Node(1, None)
n = Node(2, n)
n = Node(3, n)
print(n)  # output: <3, <2, <1, None>>>

n.next.next = n.next.next.next
print(n)  # output: <3, <2, None>>

The difference to C is: we did not have to malloc() or work with pointers because python handles memory for us. Python has references instead of pointers, they are similar but much safer and easier to use.

However, before implementing a linked list, you should consider what your requirements are regarding your collection and maybe you can pick a good one from the built-ins or from the collections module.

0
votes

You need two levels of indirection to do it the way Linus suggests, but you can potentially do it in Python perhaps by having a reference to an object which stores a reference to an object or something of this sort (an index to an index?). That said, I don't think it maps so elegantly or efficiently to Python and it'd probably be quite wasteful to use an object just to represent a single link in a linked structure.

In Python's case I'd just do the additional branching to check for cases where you're removing from the head unless there's some trick I'm missing.

As for implementing linked lists yourself, I actually find many use cases where the standard libraries don't suffice. Here's one example:

enter image description here

... where the grid might have 10,000 cells. Most linked lists provided by standard libraries aren't optimized to store 10,000+ linked lists in the size of a 32-bit index per list since they're trying to provide interfaces that allow the linked list to be used in isolation (not using a separate backing data structure for storage like an array). Typically the most efficient use of a linked list is one that doesn't own memory or manage any resources. It's just linking data in an auxiliary fashion already allocated and managed in another data structure, like this for a 128-bit (16-byte) tree node in an n-ary tree where elements can be stored at any level of the hierarchy:

struct TreeNode
{
     int32 parent;       // parent index or -1 for no parent
     int32 first_child;  // first child of this node or -1
     int32 next_sibling; // next child for the parent node or -1
     int32 element;      // element data stored in this node or -1
                         // if no data is associated
};

So there's a lot of use cases for implementing your own linked lists and other linked structures which are significantly more efficient for a more narrowly-applicable use case (grid data structures, octrees, quad-trees, graphs, etc), but again, I don't think you can use this trick in languages that don't easily allow you to utilize two or more levels of pointer indirection. Python inherently only has one for objects -- same with Java and C#. You'd need something like a "reference to a reference to an object" or an "index to an index to an object" or "an index to an object reference to an object".

Also linked lists generally aren't so useful in languages that don't allow you to manage where everything is stored in memory since you can end up getting cache misses galore iterating through linked lists otherwise if each list node is fragmented in memory as would often be the case after a GC cycle, e.g. For linked lists to be really efficient as in the case of the Linux kernel requires you to be able to really have fine control over where each node resides in memory so that list traversal is actually mostly, if not entirely, just iterating through contiguous chunks of memory. Otherwise you're generally better-off using small arrays even if that implies linear-time removals and insertions to/from the middle.