How To Use Expand In Snakemake When Some Particular Combinations Of Wildcards Are Not Desired?

Let's suppose that I have the following files, on which I want to apply some processing automatically using snakemake: test_input_C_1.txt test_input_B_2.txt test_input_A_2.txt test

Solution 1:

The expand function accepts an optional second positional argument: a function used in place of the default one (itertools.product) to combine wildcard values.

One can create a filtered version of itertools.product by wrapping it in a higher-order generator that checks that the yielded combination of wildcards is not among a pre-established blacklist:

from itertools import product

def filter_combinator(combinator, blacklist):
    def filtered_combinator(*args, **kwargs):
        for wc_comb in combinator(*args, **kwargs):
            # Use frozenset instead of tuple
            # in order to accommodate
            # unpredictable wildcard order
            if frozenset(wc_comb) not in blacklist:
                yield wc_comb
    return filtered_combinator

# "B_1" and "C_2" are undesired
forbidden = {
    frozenset({("text", "B"), ("num", 1)}),
    frozenset({("text", "C"), ("num", 2)})}

filtered_product = filter_combinator(product, forbidden)

rule all:
    input:
        # Override default combination generator
        expand("test_output_{text}_{num}.txt", filtered_product, text=["A", "B", "C"], num=[1, 2])

rule make_output:
    input: "test_input_{text}_{num}.txt"
    output: "test_output_{text}_{num}.txt"
    shell:
        """
        md5sum {input} > {output}
        """
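To check the filtering logic outside of Snakemake, here is a minimal plain-Python sketch. It assumes that expand hands the combinator one list of (name, value) pairs per wildcard, which is the shape the frozenset blacklist above relies on; the names text_pairs, num_pairs, kept and names are only illustrative.

```python
from itertools import product


def filter_combinator(combinator, blacklist):
    """Wrap `combinator` so that blacklisted combinations are skipped."""
    def filtered_combinator(*args, **kwargs):
        for wc_comb in combinator(*args, **kwargs):
            # frozenset makes the check independent of wildcard order
            if frozenset(wc_comb) not in blacklist:
                yield wc_comb
    return filtered_combinator


# "B_1" and "C_2" are undesired
forbidden = {
    frozenset({("text", "B"), ("num", 1)}),
    frozenset({("text", "C"), ("num", 2)}),
}
filtered_product = filter_combinator(product, forbidden)

# Assumption: one list of (name, value) pairs per wildcard
text_pairs = [("text", t) for t in ["A", "B", "C"]]
num_pairs = [("num", n) for n in [1, 2]]

# Each surviving combination is a tuple of (name, value) pairs
kept = [dict(comb) for comb in filtered_product(text_pairs, num_pairs)]
names = ["test_output_{text}_{num}.txt".format(**wc) for wc in kept]
print(names)
# ['test_output_A_1.txt', 'test_output_A_2.txt',
#  'test_output_B_2.txt', 'test_output_C_1.txt']
```

Four of the six possible files remain, so rule all would only request outputs whose inputs exist.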

The missing wildcard combinations can be read from the config file.

Here is an example in JSON format:

{
    "missing" :
    [
        {
            "text" : "B",
            "num" : 1
        },
        {
            "text" : "C",
            "num" : 2
        }
    ]
}

The forbidden set would be read as follows in the snakefile:

forbidden = {frozenset(wc_comb.items()) for wc_comb in config["missing"]}
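As a quick sanity check, the same set comprehension can be tried in plain Python. In this sketch the config dictionary is parsed from an inline string with json.loads, standing in for what Snakemake would load from the config file (config_text is a hypothetical name used only here).

```python
import json

# Stand-in for the parsed config file (assumption: same content as above)
config_text = """
{
    "missing": [
        {"text": "B", "num": 1},
        {"text": "C", "num": 2}
    ]
}
"""
config = json.loads(config_text)

# Each entry becomes a frozenset of (name, value) pairs
forbidden = {frozenset(wc_comb.items()) for wc_comb in config["missing"]}

print(frozenset({("text", "B"), ("num", 1)}) in forbidden)    # True
# Value types matter: the string "1" does not match the integer 1
print(frozenset({("text", "B"), ("num", "1")}) in forbidden)  # False
```

Since JSON numbers are parsed as Python integers, the values given to expand should have the same type (num=[1, 2] rather than num=["1", "2"]), otherwise the blacklist will most likely not match anything.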
