Permuting items and remembering where one particular item ended up

Benjamin Hall

I want to do the following given a list of items (multiple choice test answers):

Create a series of permutations of the items.
For each permutation indicate the index of the item that was first in the input.

So for the following input

input = ["one", "two", "three"]

the output could look like

output = [(["two", "one", "three"], 1), (["one", "three", "two"], 0), (["two", "three", "one"], 2)...]

Using python, what's the least way of doing this?

I can permute them by doing

output = random.sample(input, k=len(input))

but that doesn't give me access to which one went where.

Tsaukpaetra

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

I want to do the following given a list of items (multiple choice test answers):

Create a series of permutations of the items.

For each permutation indicate the index of the item that was first in the input.

So you're making a test bank where the first answer is correct?

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Using python,

Oh, NFC, good luck.

Benjamin Hall

@tsaukpaetra said in Permuting items and remembering where one particular item ended up:

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

I want to do the following given a list of items (multiple choice test answers):

Create a series of permutations of the items.

For each permutation indicate the index of the item that was first in the input.

So you're making a test bank where the first answer is correct?

Sort of--more a test generator. You feed it the test; for multiple choice questions the first is correct. It then generates a LaTex file (using a template) with a random permutation of the test (both questions and answers) and spits out a key for the multiple choice part (since that's annoying to do by hand).

It (hopefully) will also be able to group questions and let you provide variant questions where only one of the set gets in the test, picked randomly.

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Using python,

Oh, NFC, good luck.

Do you know how to do it in another language? Maybe I can adapt the algorithm.

Gąska

Disclaimer: not much into Python.

If the answers have reference equality semantics, you can save the first object before shuffling, shuffle, then find the index of saved object. If the answers don't have reference equality, you can wrap them in trivial objects; this should give them that.

Benjamin Hall

@gąska They're strings. I was hoping to avoid the brute-force approach...

Gąska

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

@gąska They're strings.

So, no reference equality? Is value equality enough in your case?

I was hoping to avoid the brute-force approach...

Sometimes the simplest solution is the best.

Benjamin Hall

@gąska said in Permuting items and remembering where one particular item ended up:

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

@gąska They're strings.

So, no reference equality? Is value equality enough in your case?

Should be--there are only 3-5 of them at a time and if someone's dumb enough to put the same answer twice...

I was hoping to avoid the brute-force approach...

Sometimes the simplest solution is the best.

Sigh. You're probably right.

Tsaukpaetra

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Do you know how to do it in another language? Maybe I can adapt the algorithm.

Depends on how efficient you want it to be. And if you care about super-optimum functions (I doubt it would matter since it's just text, and not a lot of it at that).

@gąska said in Permuting items and remembering where one particular item ended up:

If the answers have reference equality semantics, you can save the first object before shuffling, shuffle, then find the index of saved object. If the answers don't have reference equality, you can wrap them in trivial objects; this should give them that.

Something like this would be totally fine.

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

@gąska They're strings. I was hoping to avoid the brute-force approach...

Well, I would have loaded the bank questions into representative objects that had a flag of "this was the first one and therefore correct". I would have also had a bit more data to the import format to facilitate categorization and possible scoring value for partial-credit-type tests.

coderpatsy

I first thought to use a Fisher-Yates shuffle on the list of answers, and store when it swaps the first answer.

I don't know python but I do know JS:

Javascript code

function shuffleAnswers(arr) {
    var answerIndex = 0;
    var n = arr.length;
    var temp;
    var index;
    while (n > 0) {
        index = Math.floor(Math.random() * n);
        n--;
        if (index === answerIndex) {
            answerIndex = n;
        }
        temp = arr[n];
        arr[n] = arr[index];
        arr[index] = temp;
    }

    return [arr, answerIndex];
}

function createTestKey(answers, times) {
    var testAnswers = [];
    for (var i = 0; i < times; i++) {
        testAnswers.push(shuffleAnswers(answers.slice()));
    }

    return testAnswers;
}

Sample results

[
	[
		["one", "two", "three"],
		0
	],
	[
		["one", "two", "three"],
		0
	],
	[
		["three", "two", "one"],
		2
	],
	[
		["one", "three", "two"],
		0
	],
	[
		["one", "three", "two"],
		0
	]
]

cartman82

You could write custom shuffling code that remembers the first item, but since it sounds like you don't have performance constraints, gaska's solution is probably the best.

pie_flavor

@cartman82 gaska? who's thąt?

JBert

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Should be--there are only 3-5 of them at a time and if someone's dumb enough to put the same answer twice...

Then do that check upfront to give an ID10T error.

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

@gąska said in Permuting items and remembering where one particular item ended up:

Sometimes the simplest solution is the best.

Sigh. You're probably right.

Like @Tsaukpaetra said it might be interesting to wrap each string with an object which knows if it was the correct answer or not.

But then regarding the brute force approach: given you have 5! number of permutations and at most 5 items to look at in each permuted array then that should be a maximum of 600 compare operations. And that's if you step through each array to the end, so I think you can lower it to 300 compare operations if you think of how 50% of the time your right answer will be near the start of the permuted answer array and the rest of the array doesn't need to be checked.

That's not such a big number so speedwise it shouldn't matter too much.

topspin

Something that usually works well with sorting (and permuting is basically just sorting for a random key) is the decorate-sort-undecorate pattern. (With respect to sorting, I think for more recent python versions using a key function for sort is preferred, though)
However, while I feel it's pretty clean, it doesn't really work out any better than the brute-force approach.

So here's three different ways to do it:

Decorate-Sort-Undecorate:
First, "decorate" the list of inputs with their indices, i.e. create a list [('one', 0), ('two', 1), ('three', 2)]. Next, shuffle that list. The items will now still know their original positions. Find the one with index 0 and undecorate again. Unfortunately, I couldn't think of a way that makes the finding part better than just searching through the list. Also, I googled for a less ugly way to basically do find_if, but that's all I got.

from random import shuffle
def shuffle_DSU(input):
    decorated = zip(input, range(len(input)))
    shuffle(decorated)    
    index = [i for i,x in enumerate(decorated) if x[1] == 0][0]
    undecorated = [el[0] for el in decorated]
    return (undecorated, index)

A different idea: Explicitly create the permutation indices.
Create a list [0, ..., n-1] representing the identity permutation, then shuffle that. The result represents your permutation. Now, create the output by taking the corresponding elements of the input and grab the first element of the permutation list. Applying the output is done with an explicit loop instead of a list comprehension here (my first idea of output = [input[i] for i in indices] creates the inverse permutation of what I want)

def shuffle_perm_indices(input):
    indices = range(len(input))
    shuffle(indices)
    output = [None]*len(input)
    for i, idx in enumerate(indices):
        output[idx] = input[i]
    return (output, indices[0])

None of this is really any better than the brute-force approach, though, which also just simply searches through the list once:

def shuffle_brute_force(input):
    reference_item = input[0]
    output = input[:]   # deep-copy before shuffling
    shuffle(output)
    index = [i for i,x in enumerate(output) if x == reference_item][0]
    return (output, index)

EDIT: Creating a list of several permutations is of course trivial from there.
Hope this helps.

japonicus

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

You feed it the test; for multiple choice questions the first is correct.

If only the position of the first answer matters then shift it from the list, shuffle the other items, then reinsert the saved answer at a known random position.

Adynathos

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Create a series of permutations of the items.
For each permutation indicate the index of the item that was first in the input.

Python with NumPy:

import numpy as np

def perm_with_index(items):
	perm = np.random.permutation(items.__len__())
	
	return (
		[items[idx] for idx in perm],
		np.argmin(perm), # or add 1 here if you want 1-based index
	)

input = ["one", "two", "three"]
print(list(map(perm_with_index, [input]*3)))
# [(['one', 'three', 'two'], 0), (['three', 'one', 'two'], 1), (['one', 'two', 'three'], 0)]

Python without NumPy:

import random

def perm_with_index_nonumpy(items):
	perm = list(range(items.__len__()))
	random.shuffle(perm)
	
	return (
		[items[idx] for idx in perm],
		# or add 1 here if you want 1-based index
		# from: https://stackoverflow.com/questions/8534256/find-first-element-in-a-sequence-that-matches-a-predicate
		next(idx for idx, p in enumerate(perm) if p == 0), 
	)
	
input = ["one", "two", "three"]
print(list(map(perm_with_index_nonumpy, [input]*3)))
# [(['one', 'three', 'two'], 0), (['one', 'three', 'two'], 0), (['two', 'one', 'three'], 1)]

masonwheeler

This post is deleted!

Dragnslcr

@topspin said in Permuting items and remembering where one particular item ended up:

Something that usually works well with sorting (and permuting is basically just sorting for a random key) is the decorate-sort-undecorate pattern. (With respect to sorting, I think for more recent python versions using a key function for sort is preferred, though)
However, while I feel it's pretty clean, it doesn't really work out any better than the brute-force approach.

So here's three different ways to do it:

Decorate-Sort-Undecorate:
First, "decorate" the list of inputs with their indices, i.e. create a list [('one', 0), ('two', 1), ('three', 2)]. Next, shuffle that list. The items will now still know their original positions. Find the one with index 0 and undecorate again. Unfortunately, I couldn't think of a way that makes the finding part better than just searching through the list. Also, I googled for a less ugly way to basically do find_if, but that's all I got.

This was basically my first thought, as well. If you only care about where the one correct answer is, you could also make the list something like [("foo", True), ("bar", False), ("bah", False)], and then just do a simple loop like

for answer in list:
    if list[1]:
        // do stuff with correct answer

The most important thing to keep in mind is to avoid premature optimization. If this is code that's going to execute once a day, it doesn't matter if it takes 5 seconds to complete instead of 3 seconds. You're much better off sticking to simple constructs that will be easy for someone else (including future you) to read.

topspin

@adynathos argmin is a nice touch. Why do you call then .__len__ magic method directly instead of the len function?

Adynathos

@topspin said in Permuting items and remembering where one particular item ended up:

Why do you call then .len magic method directly instead of the len function?

I find it more logical: the length is a property of the object.
I use NumPy a lot, the (multidimensional) size of an ndarray is array.shape.
(not np.shape(array))

Benjamin Hall

Thanks everyone--

I decided to go a different direction (suggested by @Tsaukpaetra) and add "tags" to the input file to control things like grouping; this allows me to parse the questions and their answers into suitable objects that can hold their correctness-state.

So far I have the parser written. It's a horrible mess of chained ifs and checking for state, but I'll pretend it's a proper state machine. My first thought was to parse it with regular expressions............and then I realized that that would be a right

I didn't go with a more standard format because I want something that I can have another teacher send me (as a raw text file) and insert the grouping tags in by hand easily. I don't trust my fellow teachers to create anything more complicated than copying from Word into a raw text file. Or they can send their Word files to me and I'll do that.

Adynathos

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Or they can send their Word files to me and I'll do that.

Have them write it in excel, each question/answer in its own cell.
Then load with https://openpyxl.readthedocs.io/en/stable/

Gąska

@adynathos said in Permuting items and remembering where one particular item ended up:

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Or they can send their Word files to me and I'll do that.

Have them write it in excel, each question/answer in its own cell.

You wish!

JBert

@adynathos said in Permuting items and remembering where one particular item ended up:

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Or they can send their Word files to me and I'll do that.

Have them write it in excel, each question/answer in its own cell.
Then load with https://openpyxl.readthedocs.io/en/stable/

Using Excel files as a data interchange format can seriously backfire if they type multiple answers into one cell, enter dates which get turned into formulas or in any other way do something which ticks off Excel.

They don't need any calculations done and they likely don't care about the subtleties of Excel so plain text might still be the best.

dkf

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

Using python, what's the least way of doing this?

Make a range that corresponds to the indices into the input list and permute that (using the code you already know). Finally, use that index permutation list to generate the result list and the index of the zero element. Easy.

Here's the sort of thing I mean:

def permute(input_list):
    indices = random.sample(range(len(input_list)), k=len(input_list))
    for result_index, idx in enumerate(indices):
        if idx == 0:
            return [input_list[i] for i in indices], result_index
    # Should be unreachable if input_list is non-empty
    raise Exception("empty input list?")

Appears to work when I test:

>>> permute([3,4,5,6,7,8,9])
([5, 9, 4, 3, 6, 8, 7], 3)
>>> permute([3,4,5,6,7,8,9])
([5, 3, 7, 9, 4, 8, 6], 1)
>>> permute([3,4,5,6,7,8,9])
([3, 5, 6, 7, 9, 4, 8], 0)
>>> permute([3,4,5,6,7,8,9])
([8, 7, 6, 4, 3, 5, 9], 4)

ben_lubar

@benjamin-hall said in Permuting items and remembering where one particular item ended up:

there are only 3-5 of them at a time

Finding an element using a linear search is O(n).

Shuffling an array is O(n), assuming your RNG is O(1).

Your RNG is probably several orders of magnitude slower than swapping two to four array elements (the last one doesn't need to get swapped).

Just go with the linear search.

Alternatively, make your array into pairs of string+bool and set the first bool to true and the rest to false.

Captain

@benjamin-hall Um, how are you permuting the things? With a permutation matrix?

Benjamin Hall

@captain said in Permuting items and remembering where one particular item ended up:

@benjamin-hall Um, how are you permuting the things? With a permutation matrix?

Built-in function random.shuffle. I don't need cryptographic randomness here.

Captain

@benjamin-hall Yeah, but the problem is that you're doing a shuffle and aren't keeping track of the permutation matrix that generated it.

For an list of length n, generate a list of of integers 1 to n. Shuffle the list of integers. Use the list of shuffled integers to rearrange the original list. Use the list of shuffled integers to track what went where.

But whatever, it looks like you got it figured out.