Using SModelS

SModelS can take SLHA or LHE files as input (see Basic Input). It ships with a command-line tool runSModelS.py, which reports on the SMS decomposition and theory predictions in several output formats.

For users more familiar with Python and the SModelS basics, example code is provided in Example.py, showing how to access the main SModelS functionalities: decomposition, the database, and the computation of theory predictions.

The command-line tool (runSModelS.py) and the example Python code (Example.py) are described below.

Note

For non-MSSM (incl. non-SUSY) input models the user needs to write their own model.py file and specify which BSM particles are even or odd under the assumed Z2 (or similar) symmetry (see adding new particles). From version 1.2.0 onwards it is also necessary to define the BSM particle quantum numbers in the same file [1].
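
For orientation, a minimal model.py might look as follows. This is an illustrative sketch only: the particle label, pdg code and quantum numbers are placeholders for a hypothetical Z2-odd neutral fermion, and the Particle interface shown is the one used by the model files in smodels/share/models (v2.x):

# Minimal model.py sketch; all values are placeholders
from smodels.theory.particle import Particle

# Z2parity: -1 for Z2-odd, +1 for Z2-even; eCharge in units of e;
# colordim: dimension of the SU(3) representation; spin in units of hbar
chi = Particle(Z2parity=-1, label='chi', pdg=5000001,
               eCharge=0, colordim=1, spin=0.5)

# SModelS loads the BSM particle content from this list:
BSMList = [chi]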

runSModelS.py

runSModelS.py covers several different applications of the SModelS functionality, with the option of turning various features on or off, as well as setting the basic parameters. These functionalities include detailed checks of input SLHA files, running the decomposition, evaluating the theory predictions and comparing them to the experimental limits available in the database, determining missing topologies and printing the output in several available formats.

Starting with v1.1, runSModelS.py is equipped with two additional functionalities: first, it can process a folder containing a set of SLHA or LHE files; second, it can parallelize the processing of the files in that folder.

The usage of runSModelS is:

runSModelS.py [-h] -f FILENAME [-p PARAMETERFILE] [-o OUTPUTDIR] [-d] [-t] [-C] [-V] [-c] [-v VERBOSE] [-T TIMEOUT]

options:
-h, --help

show this help message and exit

-f FILENAME, --filename FILENAME

name of SLHA or LHE input file or a directory path (required argument). If a directory is given, loop over all files in the directory

-p PARAMETERFILE, --parameterFile PARAMETERFILE

name of parameter file, where most options are defined (optional argument). If not set, use all parameters from smodels/etc/parameters_default.ini

-o OUTPUTDIR, --outputDir OUTPUTDIR

name of output directory (optional argument). The default folder is: ./results/

-d, --development

if set, SModelS will run in development mode and exit if any errors are found.

-t, --force_txt

force loading the text database

-C, --colors

colored output

-V, --version

show program’s version number and exit

-c, --run-crashreport

parse a crash report file and use its contents for a SModelS run. Supply the crash file via '--filename myfile.crash'

-v VERBOSE, --verbose VERBOSE

sets the verbosity level (debug, info, warning, error). Default value is info.

-T TIMEOUT, --timeout TIMEOUT

define a limit on the running time (in seconds). If not set, run without a time limit. If a directory is given as input, the timeout is applied to each individual file.

A typical usage example is:

runSModelS.py -f inputFiles/slha/simplyGluino.slha -p parameters.ini -o ./ -v warning

The resulting output will be generated in the current folder, according to the printer options set in the parameters file.
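
Since the input (-f) may also be a directory, the same command can be used to process a whole set of files, e.g.:

runSModelS.py -f inputFiles/slha/ -p parameters.ini -o results/ -T 300

In this case the timeout of 300 seconds applies to each individual file, and parallelization over the input files is controlled by the ncpus parameter in the parameters file (see below).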

The Parameters File

The basic options and parameters used by runSModelS.py are defined in the parameters file. An example parameter file, including all available parameters together with a short description, is stored in parameters.ini. If no parameter file is specified, the default parameters stored in smodels/etc/parameters_default.ini are used. Below we give more detailed information about each entry in the parameters file.

  • options: main options for turning SModelS features on or off

  • checkInput (True/False): if True, runSModelS.py will run the file check tool on the input file and verify if the input contains all the necessary information.

  • doCompress (True/False): turns mass compression on or off during the decomposition. (Note that the compression is only applied to prompt particles, i.e. those with widths larger than the promptWidth parameter.)

  • doInvisible (True/False): turns invisible compression on or off during the decomposition.

  • computeStatistics (True/False): turns the likelihood computation on or off (see likelihood calculation). If True, the likelihoods L_BSM, L_SM and L_max are computed for the EM-type results.

  • testCoverage (True/False): set to True to run the coverage tool.

  • combineSRs (True/False): set to True to combine signal regions in EM-type results when covariance matrix or pyhf JSON likelihood is available. Set to False to use only the most sensitive signal region (faster!). Available v1.1.3 onwards for covariance matrices and v1.2.4 onwards for full likelihoods (using pyhf).

  • reportAllSRs (True/False): set to True to report all signal regions, instead of best signal region only. If True no signal region combination is performed. Available from v2.1.1 onwards. Beware, the output can be long.

  • combineAnas (list of results): list of analysis IDs to be combined. All the analyses are assumed to be fully uncorrelated, so use with caution! Available from v2.2.0 onwards. NB, due to issues with pyhf, for the time being it is advisable to use this feature only with combineSRs=False.

  • experimentalFeatures (True/False): set to True to enable experimental features that are not yet considered part of SModelS proper. Available from v2.1.1 onwards. Use with care.

  • particles: defines the particle content of the BSM model

  • model: pathname to the Python file that defines the particle content of the BSM model, or to an SLHA file containing QNUMBERS blocks for the BSM particles (see Basic Input). The Python file can be given either in Unix file notation (“/path/to/model.py”) or as a Python module path (“path.to.model”). Defaults to share.models.mssm, which is the standard MSSM. See the smodels/share/models folder for more examples. The directory name can be omitted; in that case, the current working directory as well as smodels/share/models are searched.

  • promptWidth: total decay width in GeV above which decays are considered prompt, default is 1e-11; available v2.0 onwards. (nb default was 1e-8 in v2.0 and 2.1, changed to 1e-11 in v2.2)

  • stableWidth: total decay width in GeV below which particles are considered as (quasi)stable, default is 1e-25; available v2.0 onwards.

  • parameters: basic parameter values for running SModelS

  • sigmacut (float): minimum value for an element weight (in fb). Elements with a weight below sigmacut are neglected during the decomposition of SLHA files (see Minimum Decomposition Weight). The default value is 0.005 fb. Note that, depending on the input model, the running time may increase considerably if sigmacut is too low, while too large values might eliminate relevant elements.

  • minmassgap (float): maximum value of the mass difference (in GeV) for performing mass compression. Only used if doCompress = True.

  • maxcond (float): maximum allowed value (in the [0,1] interval) for the violation of upper limit conditions. A zero value means the conditions are strictly enforced, while 1 means the conditions are never enforced. Only relevant for printing the output summary.

  • ncpus (int): number of CPUs. When processing multiple SLHA/LHE files, SModelS can run in a parallelized fashion, splitting the input files into equal chunks. ncpus = 0 parallelizes to as many processes as the machine has CPU cores; negative values mean parallelization to the number of CPU cores minus the absolute value of ncpus (but at least 1). The default value is 1. Warning: Python already parallelizes many tasks internally.

  • database: options for selecting the database and the experimental results to be used

  • path: the absolute (or relative) path to the database. The user can supply either the directory name of the database, or the path to the pickle file. Also http addresses may be given, e.g. https://smodels.github.io/database/official230. See the github database release page for a list of public database versions. Shorthand notations are available: path=official refers to the official database of your SModelS version, while path=latest refers to the latest available database release. The ‘+’ operator allows for extending the “official” or “latest” database with add-ons:

    • +fastlim: adds fastlim results (from early 8 TeV ATLAS analyses); from v2.1.0 onward

    • +superseded: adds results which were previously available but were superseded by newer ones; from v2.1.0 onward

    • +nonaggregated: adds analyses with non-aggregated SRs in addition to the aggregated results in CMS analyses; from v2.2.0 onward

    • +full_llhds: replaces simplified HistFactory statistical models by full ones in ATLAS analyses; from v2.3.0 onward (careful, this increases the runtime considerably!)

Examples are path=official+fastlim, path=official+nonaggregated, path=official+nonaggregated+full_llhds. Note that order matters: results are replaced in the specified sequence, so path=nonaggregated+official will fall back onto the official database with aggregated results. In principle, the add-ons can also be used alone, e.g. path=nonaggregated, though this is of little practical use. Finally, debug refers to a version of the database with extra information; it is not intended for use by regular users and is mentioned here only for completeness.

  • analyses (list of results): set to [‘all’] to use all available results. If a list of experimental analyses is given, only these will be used. For instance, setting analyses = CMS-PAS-SUS-13-008,ATLAS-CONF-2013-024 will only use the experimental results from CMS-PAS-SUS-13-008 and ATLAS-CONF-2013-024. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. So, for example, analyses = CMS* leads to evaluation of results from the CMS experiment only, while *SUS* selects everything containing SUS, no matter if from CMS or ATLAS. Furthermore, the selection of analyses can be restricted to a given centre-of-mass energy with a suffix beginning with a colon and an energy string in unum-style, like :13*TeV. Note that the asterisk behind the colon is not a wildcard. :13, :13TeV and :13 TeV are also understood but discouraged.

  • txnames (list of topologies): set to [‘all’] to use all available simplified model topologies. The topologies are labeled according to the txname convention. If a list of txnames is given, only the corresponding topologies will be considered. For instance, setting txnames = T2 will only consider experimental results for \(pp \to \tilde{q} + \tilde{q} \to (jet+\tilde{\chi}_1^0) + (jet+\tilde{\chi}_1^0)\) and the output will only contain constraints for this topology. A list of all topologies and their corresponding txnames can be found here. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. So, for example, txnames = T[12]*bb* picks all txnames beginning with T1 or T2 and containing bb, which as of the time of writing were: T1bbbb, T1bbbt, T1bbqq, T1bbtt, T2bb, T2bbWW, T2bbWWoff.

  • dataselector (list of datasets): set to [‘all’] to use all available data sets. If dataselector = upperLimit (efficiencyMap), only UL-type results (EM-type results) will be used. Furthermore, if a list of signal regions (data sets) is given, only the experimental results containing these datasets will be used. For instance, if dataselector = SRA mCT150,SRA mCT200, only these signal regions will be used. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. Wildcard examples are given above.

  • dataTypes (list of dataTypes): dataType of the analysis (all, efficiencyMap or upperLimit). Can be wildcarded with the usual shell wildcards: *, ?, [<list-of-or’ed-letters>]. Wildcard examples are given above.

  • printer: main options for the output format

  • outputType (list of outputs): use to list all the output formats to be generated. Available output formats are: summary, stdout, log, python, xml, slha.

  • stdout-printer: options for the stdout or log printer

  • printDatabase (True/False): set to True to print the list of selected experimental results to stdout.

  • addAnaInfo (True/False): set to True to include detailed information about the txnames tested by each experimental result. Only used if printDatabase=True.

  • printDecomp (True/False): set to True to print basic information from the decomposition (topologies, total weights, …).

  • addElementInfo (True/False): set to True to include detailed information about the elements generated by the decomposition. Only used if printDecomp=True.

  • printExtendedResults (True/False): set to True to print extended information about the theory predictions, including the PIDs of the particles contributing to the predicted cross section, their masses and the expected upper limit (if available).

  • addCoverageID (True/False): set to True to print the list of element IDs contributing to each missing topology (see coverage). Only used if testCoverage = True. This option should be used along with addElementInfo = True so the user can precisely identify which elements were classified as missing.

  • summary-printer: options for the summary printer

  • expandedSummary (True/False): set True to include in the summary output all applicable experimental results, False for only the strongest one.

  • slha-printer: options for the SLHA printer

    • expandedOutput (True/False): set True to print the full list of results. If False only the most constraining result and excluding results are printed.

  • python-printer: options for the Python printer

  • addElementList (True/False): set True to include in the Python output all information about all elements generated in the decomposition. If set to True the output file can be quite large.

  • addTxWeights (True/False): set True to print the contribution from individual topologies to each theory prediction. Available v1.1.3 onwards.

  • xml-printer: options for the xml printer

  • addElementList (True/False): set True to include in the xml output all information about all elements generated in the decomposition. If set to True the output file can be quite large.

  • addTxWeights (True/False): set True to print the contribution from individual topologies to each theory prediction. Available v1.1.3 onwards.
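
For concreteness, a minimal parameters file combining the options above might look as follows. This is an illustrative sketch; the section names follow smodels/etc/parameters_default.ini, which should be consulted for the complete list of entries:

[options]
checkInput = True
doCompress = True
doInvisible = True
computeStatistics = True
testCoverage = True
combineSRs = False

[particles]
model = share.models.mssm
promptWidth = 1e-11
stableWidth = 1e-25

[parameters]
sigmacut = 0.005
minmassgap = 5.
maxcond = 0.2
ncpus = 1

[database]
path = official
analyses = all
txnames = all
dataselector = all
dataTypes = all

[printer]
outputType = summary, python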

The Output

The results of runSModelS.py are printed in the format(s) specified by the outputType in the parameters file. The available formats are summary, stdout, log, python, xml and slha (see outputType above).

In addition, when running over multiple files, a simple text output (summary.txt) is generated with basic information about the results for each input file. A detailed explanation of the information contained in each type of output is given in SModelS Output.
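
The python output, for instance, is a file defining a single dictionary named smodelsOutput. Below is a minimal sketch of how one might load and inspect it in a separate script; the file name (assumed here to follow the input file name) and the dictionary keys should be checked against SModelS Output:

# Illustrative: load and inspect the python-format output of runSModelS.py
import runpy

out = runpy.run_path('./simplyGluino.slha.py')['smodelsOutput']
print(list(out.keys()))  # top-level blocks, e.g. 'OutputStatus', 'ExptRes', ...
for res in out.get('ExptRes', []):
    # r-value of each applicable experimental result:
    print(res['AnalysisID'], res['r'])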

Example.py

Although runSModelS.py provides the main SModelS features with a command line interface, users more familiar with Python and the SModelS language may prefer to write their own main program. A simple example code for this purpose is provided in examples/Example.py. Below we go step-by-step through this example code:

  • Define the input model. By default SModelS assumes the MSSM particle content. To use SModelS with a different particle content, the user must define the new particle content and set modelFile to the path of the model file (see particles:model in the Parameters File).

from smodels.tools import runtime
# Define your model (list of BSM particles)
runtime.modelFile = 'smodels.share.models.mssm'
# runtime.modelFile = 'mssmQNumbers.slha'
  • Import the SModelS modules and methods. If the example code file is not located in the smodels installation folder, simply add “sys.path.append(<smodels installation path>)” before importing smodels. Set SModelS verbosity level.

from smodels.theory import decomposer
from smodels.tools.physicsUnits import fb, GeV, TeV
from smodels.theory.theoryPrediction import theoryPredictionsFor, TheoryPredictionsCombiner
from smodels.experiment.databaseObj import Database
from smodels.tools import coverage
from smodels.tools.smodelsLogging import setLogLevel
from smodels.particlesLoader import BSMList
from smodels.share.models.SMparticles import SMList
from smodels.theory.model import Model
import time
  • Set the path to the database URL. Specify which database to use. It can be the path to the smodels-database folder, the path to a pickle file or (starting with v1.1.3) a URL path.

def main(inputFile='inputFiles/slha/lightEWinos.slha', sigmacut=0.005*fb, database='official'):
    """
    Main program. Displays basic use case.
    """
    # Load the database (directory path, pickle file or URL):
    database = Database(database)
  • Load the model and set the path to the input file. Load BSM and SM particle content; specify the location of the input file (must be an SLHA or LHE file, see Basic Input) and update particles in the model.



    model = Model(BSMparticles=BSMList, SMparticles=SMList)
    # Path to input file (either a SLHA or LHE file)
#     lhefile = 'inputFiles/lhe/gluino_squarks.lhe'
    slhafile = inputFile
    # Load the BSM particle properties (masses, widths, quantum numbers) from the input file:
    model.updateParticles(inputFile=slhafile)

    # Set main options for decomposition (sigmacut is passed in as an argument):
    mingap = 5.*GeV

    t0 = time.time()
    toplist = decomposer.decompose(model, sigmacut, doCompress=True, doInvisible=True, minmassgap=mingap)

    # Access basic information from decomposition, using the topology list and topology objects:
    print("\n Decomposition done in %1.2fs" %(time.time()-t0))
    print("\n Decomposition Results: ")
    print("\t  Total number of topologies: %i " % len(toplist))
    nel = sum([len(top.elementList) for top in toplist])
    print("\t  Total number of elements = %i " % nel)
    # Print information about the m-th topology:
    m = 2
    if len(toplist) > m:
        top = toplist[m]
        print("\t\t %i-th topology  = " % m, top, "with total cross section =", top.getTotalWeight())
        # Print information about the n-th element in the m-th topology:
        n = 0
        el = top.elementList[n]
        print("\t\t %i-th element from %i-th topology  = " % (n, m), el, end="")
        print("\n\t\t\twith final states =", el.getFinalStates(),
              "\n\t\t\twith cross section =", el.weight,
              "\n\t\t\tand masses = ", el.mass)

output:

 Decomposition Results: 
	  Total number of topologies: 57 
	  Total number of elements = 18544 
		 2-th topology  =  [][2] with total cross section = ['8.00E+00 [TeV]:3.05E-01 [pb] (None, None)', '1.30E+01 [TeV]:5.21E-01 [pb] (None, None)']
		 0-th element from 2-th topology  =  [[],[[e-,nu]]]
			with final states = [N1, N1~] 
			with cross section = ['8.00E+00 [TeV]:4.03E-03 [pb] (-1000024, 1000022)', '1.30E+01 [TeV]:7.03E-03 [pb] (-1000024, 1000022)'] 
			and masses =  [[6.81E+01 [GeV]], [1.34E+02 [GeV], 6.81E+01 [GeV]]]
  • Load the experimental results to be used to constrain the input model. Here, all results are used:

    listOfExpRes = database.getExpResults()

Alternatively, the getExpResults method can take as arguments specific results to be loaded.

    t0 = time.time()
    # Print basic information about the results loaded.
    # Count the number of loaded UL and EM experimental results:
    nUL, nEM = 0, 0
    for exp in listOfExpRes:
        expType = exp.datasets[0].dataInfo.dataType
        if expType == 'upperLimit':
            nUL += 1
        elif expType == 'efficiencyMap':
            nEM += 1
    print("\n Loaded Database with %i UL results and %i EM results " % (nUL, nEM))

output:

 Loaded Database with 86 UL results and 40 EM results 
    print("\n Theory Predictions and Constraints:")
    rmax = 0.
    for expResult in listOfExpRes:
        predictions = theoryPredictionsFor(expResult, toplist, combinedResults=False )
        if not predictions:
            continue  # Skip if there are no constraints from this result
        print('\n %s ' % expResult.globalInfo.id)
        for theoryPrediction in predictions:
            dataset = theoryPrediction.dataset
            datasetID = theoryPrediction.dataId()
            mass = theoryPrediction.mass
            txnames = [str(txname) for txname in theoryPrediction.txnames]
            PIDs = theoryPrediction.PIDs
            print("------------------------")
            print("Dataset = ", datasetID)  # Analysis name
            print("TxNames = ", txnames)

output:

 ATLAS-SUSY-2015-06 
------------------------
Dataset =  SR5j
TxNames =  ['T1', 'T2']
Prediction Mass =  [[5.77E+02 [GeV], 6.99E+01 [GeV]], [5.77E+02 [GeV], 6.81E+01 [GeV]]]
Prediction PIDs =  [[[1000021, 1000022], [1000021, 1000022]], [[1000021, 1000023], [1000021, 1000022]], [[1000021, 1000022], [1000021, 1000022]]]
Theory Prediction =  1.30E+01 [TeV]:5.43E-06 [pb] (1000021, 1000021)
Condition Violation =  None
  • Get the corresponding upper limit. This value can be compared to the theory prediction to decide whether a model is excluded or not:

            print("Theory Prediction = ", theoryPrediction.xsection)  # Signal cross section

output:

UL for theory prediction =  1.79E+00 [fb]
  • Print the r-value, i.e. the ratio theory prediction/upper limit. A value of \(r \geq 1\) means that an experimental result excludes the input model. For EM-type results also compute the likelihood values. Determine the most constraining result:

            # Compute the r-value
            r = theoryPrediction.getRValue()
            print("r = %1.3E" % r)
            # Compute likelihoods for EM-type results:
            if dataset.getType() == 'efficiencyMap':
                theoryPrediction.computeStatistics()
                print('L_BSM, L_SM, L_max = %1.3E, %1.3E, %1.3E' % (theoryPrediction.likelihood(),
                      theoryPrediction.lsm(), theoryPrediction.lmax()))

output:

r = 3.031E-03
L_BSM, L_SM, L_max = 7.167E-03, 7.215E-03, 7.215E-03
  • Print the most constraining experimental result. Using the largest r-value, determine if the model has been excluded or not by the selected experimental results:

            # Keep track of the largest r-value and the most constraining result:
            if r > rmax:
                rmax = r
                bestResult = expResult.globalInfo.id
            allPredictions.append(theoryPrediction)

    # Print the most constraining experimental result
    print("\nThe largest r-value (theory/upper limit ratio) is %1.3E" % rmax)

output:

The largest r-value (theory/upper limit ratio) is  1.204E+00
(The input model is likely excluded by CMS-SUS-13-006)
  • Select analyses. Using the theory predictions, select a (user-defined) subset of analyses to be combined:

    if rmax >= 1.:
        print("(The input model is likely excluded by %s)" % bestResult)
    else:
        print("(The input model is not excluded by the simplified model results)")

    print("\n Theory Predictions done in %1.2fs" %(time.time()-t0))
    t0 = time.time()
    # Select a few results results for combination:
    combineAnas = ['ATLAS-SUSY-2013-11', 'CMS-SUS-13-013']
    selectedTheoryPreds = []
    for tp in allPredictions:
        if tp.analysisId() in combineAnas:
            selectedTheoryPreds.append(tp)
  • Combine analyses. Using the selected analyses, combine them under the assumption they are fully uncorrelated:

    # Make sure each analysis appears only once:
    expIDs = [tp.analysisId() for tp in selectedTheoryPreds]
    if len(expIDs) != len(set(expIDs)):
        print("\nDuplicated results when trying to combine analyses. Combination will be skipped.")
    # Only compute combination if at least two results were selected
  • Print the combination. Print the r-values and likelihood for the combination:

    elif len(selectedTheoryPreds) > 1:
        combiner = TheoryPredictionsCombiner(selectedTheoryPreds)
        combiner.computeStatistics()
        llhd = combiner.likelihood()

output:

Combined analyses: ATLAS-SUSY-2013-11,CMS-SUS-13-013
Combined r value: 2.183E-02
Combined r value (expected): 2.183E-02
Likelihoods: L, L_max, L_SM =  1.385E-02,  1.401E-02,  1.394E-02
  • Identify missing topologies. Using the output from decomposition, identify the missing topologies and print some basic information:

        lmax = combiner.lmax()
        lsm = combiner.lsm()
        print("\n\nCombined analyses:", combiner.analysisId())
        print("Combined r value: %1.3E" % combiner.getRValue())
        print("Combined r value (expected): %1.3E" % combiner.getRValue(expected=True))
        print("Likelihoods: L, L_max, L_SM = %10.3E, %10.3E, %10.3E\n" % (llhd, lmax, lsm))

    print("\n Combination of analyses done in %1.2fs" %(time.time()-t0))
    t0 = time.time()
    # Find out missing topologies for sqrts=13*TeV:
    uncovered = coverage.Uncovered(toplist, sqrts=13.*TeV)
    print("\n Coverage done in %1.2fs" %(time.time()-t0))
    # First sort coverage groups by label
    groups = sorted(uncovered.groups[:], key=lambda g: g.label)
    # Print uncovered cross-sections:
    for group in groups:
        print("\nTotal cross-section for %s (fb): %10.3E\n" % (group.description, group.getTotalXSec()))

    missingTopos = uncovered.getGroup('missing (prompt)')
    # Print some of the missing topologies:
    if missingTopos.generalElements:
        print('Missing topologies (up to 3):')
        for genEl in missingTopos.generalElements[:3]:
            print('Element:', genEl)
            print('\tcross-section (fb):', genEl.missingX)
    else:
        print("No missing topologies found\n")

output:

Total cross-section for missing topologies (fb):  1.120E+04


Total cross-section for missing topologies with displaced decays (fb):  0.000E+00


Total cross-section for missing topologies with prompt decays (fb):  1.474E+04


Total cross-section for topologies outside the grid (fb):  3.976E+03

Missing topologies (up to 3):
Element: [[[jet,jet]],[[l,nu]]]  (MET,MET)
	cross-section (fb): 1203.688445123887
Element: [[[jet,jet]],[[nu,ta]]]  (MET,MET)
	cross-section (fb): 600.2542979266833
Element: [[[jet,jet]],[[b,b]]]  (MET,MET)
	cross-section (fb): 531.534863034501

No displaced decays

It is worth noting that SModelS does not include any statistical treatment of the results, for instance correction factors like the “look-elsewhere effect”. For this reason, the output states only that the input model is “likely excluded”.

Notes:
  • For an SLHA input file, the decays of SM particles (or BSM Z2-even particles) are always ignored during the decomposition. Furthermore, if there are two cross sections at different calculation orders (say LO and NLO) for the same process, only the highest order is used.

  • The list of elements can be extremely long. Try setting addElementInfo = False and/or printDecomp = False to obtain a smaller output.

  • A comment of caution is in order regarding naively using the highest \(r\)-value reported by SModelS, as this does not necessarily come from the most sensitive analysis. For a rigorous statistical interpretation, one should use the \(r\)-value of the result with the highest expected \(r\) (\(r_{exp}\)). Unfortunately, for UL-type results, the expected limits are often not available; \(r_{exp}\) is then reported as N/A in the SModelS output.
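
A minimal sketch of such a selection, using the theory predictions collected in Example.py above (mostSensitiveR is a hypothetical helper, not part of SModelS; it assumes getRValue(expected=True) returns None when no expected limit is available):

def mostSensitiveR(predictions):
    # Keep only predictions for which an expected r-value exists:
    withExp = [tp for tp in predictions if tp.getRValue(expected=True) is not None]
    if not withExp:
        return None
    # Pick the result with the highest expected r and return its observed r:
    best = max(withExp, key=lambda tp: tp.getRValue(expected=True))
    return best.getRValue()

# e.g., with the predictions from Example.py:
# print("r of most sensitive result = %1.3E" % mostSensitiveR(allPredictions))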

[1]

SLHA files including decay tables and cross sections, together with the corresponding model.py, can conveniently be generated via the SModelS-micrOMEGAs interface; see arXiv:1606.03834.