You could use a number of queries that use Take
and Skip
, but that would add too many iterations on the original list, I believe.
Rather, I think you should create an iterator of your own, like so:
public static IEnumerable<IEnumerable<T>> GetEnumerableOfEnumerables<T>(
IEnumerable<T> enumerable, int groupSize)
{
// The list to return.
List<T> list = new List<T>(groupSize);
// Cycle through all of the items.
foreach (T item in enumerable)
{
// Add the item.
list.Add(item);
// If the list has the number of elements, return that.
if (list.Count == groupSize)
{
// Return the list.
yield return list;
// Set the list to a new list.
list = new List<T>(groupSize);
}
}
// Return the remainder if there is any,
if (list.Count != 0)
{
// Return the list.
yield return list;
}
}
You can then call this and it is LINQ enabled so you can perform other operations on the resulting sequences.
In light of Sam's answer, I felt there was an easier way to do this without:
- Iterating through the list again (which I didn't do originally)
- Materializing the items in groups before releasing the chunk (for large chunks of items, there would be memory issues)
- All of the code that Sam posted
That said, here's another pass, which I've codified in an extension method to IEnumerable<T>
called Chunk
:
public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source,
int chunkSize)
{
// Validate parameters.
if (source == null) throw new ArgumentNullException(nameof(source));
if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize),
"The chunkSize parameter must be a positive value.");
// Call the internal implementation.
return source.ChunkInternal(chunkSize);
}
Nothing surprising up there, just basic error checking.
Moving on to ChunkInternal
:
private static IEnumerable<IEnumerable<T>> ChunkInternal<T>(
this IEnumerable<T> source, int chunkSize)
{
// Validate parameters.
Debug.Assert(source != null);
Debug.Assert(chunkSize > 0);
// Get the enumerator. Dispose of when done.
using (IEnumerator<T> enumerator = source.GetEnumerator())
do
{
// Move to the next element. If there's nothing left
// then get out.
if (!enumerator.MoveNext()) yield break;
// Return the chunked sequence.
yield return ChunkSequence(enumerator, chunkSize);
} while (true);
}
Basically, it gets the IEnumerator<T>
and manually iterates through each item. It checks to see if there any items currently to be enumerated. After each chunk is enumerated through, if there aren't any items left, it breaks out.
Once it detects there are items in the sequence, it delegates the responsibility for the inner IEnumerable<T>
implementation to ChunkSequence
:
private static IEnumerable<T> ChunkSequence<T>(IEnumerator<T> enumerator,
int chunkSize)
{
// Validate parameters.
Debug.Assert(enumerator != null);
Debug.Assert(chunkSize > 0);
// The count.
int count = 0;
// There is at least one item. Yield and then continue.
do
{
// Yield the item.
yield return enumerator.Current;
} while (++count < chunkSize && enumerator.MoveNext());
}
Since MoveNext
was already called on the IEnumerator<T>
passed to ChunkSequence
, it yields the item returned by Current
and then increments the count, making sure never to return more than chunkSize
items and moving to the next item in the sequence after every iteration (but short-circuited if the number of items yielded exceeds the chunk size).
If there are no items left, then the InternalChunk
method will make another pass in the outer loop, but when MoveNext
is called a second time, it will still return false, as per the documentation (emphasis mine):
If MoveNext passes the end of the collection, the enumerator is
positioned after the last element in the collection and MoveNext
returns false. When the enumerator is at this position, subsequent
calls to MoveNext also return false until Reset is called.
At this point, the loop will break, and the sequence of sequences will terminate.
This is a simple test:
static void Main()
{
string s = "agewpsqfxyimc";
int count = 0;
// Group by three.
foreach (IEnumerable<char> g in s.Chunk(3))
{
// Print out the group.
Console.Write("Group: {0} - ", ++count);
// Print the items.
foreach (char c in g)
{
// Print the item.
Console.Write(c + ", ");
}
// Finish the line.
Console.WriteLine();
}
}
Output:
Group: 1 - a, g, e,
Group: 2 - w, p, s,
Group: 3 - q, f, x,
Group: 4 - y, i, m,
Group: 5 - c,
An important note, this will not work if you don't drain the entire child sequence or break at any point in the parent sequence. This is an important caveat, but if your use case is that you will consume every element of the sequence of sequences, then this will work for you.
Additionally, it will do strange things if you play with the order, just as Sam's did at one point.