Using SModelS

SModelS can take SLHA or LHE files as input (see Basic Input). It ships with a command-line tool runSModelS.py, which reports on the SMS decomposition and theory predictions in several output formats.

For users more familiar with Python and the SModelS basics, an example code Example.py is provided showing how to access the main SModelS functionalities: decomposition, the database and computation of theory predictions.

The command-line tool (runSModelS.py) and the example Python code (Example.py) are described below.

Note

For non-MSSM (incl. non-SUSY) input models the user needs to specify the BSM particles and their quantum numbers [1] (see adding new particles).

runSModelS.py

runSModelS.py covers several different applications of the SModelS functionality, with the option of turning various features on or off, as well as setting the basic parameters. These functionalities include detailed checks of input SLHA files, running the decomposition, evaluating the theory predictions and comparing them to the experimental limits available in the database, determining missing topologies and printing the output in several available formats.

Starting with v1.1, runSModelS.py has two additional capabilities: it can process a folder containing a set of SLHA or LHE files, and it can parallelize the processing of the files in that folder.

The usage of runSModelS is:

runSModelS.py [-h] -f FILENAME [-p PARAMETERFILE] [-o OUTPUTDIR] [-d] [-t] [-C] [-V] [-c] [-v VERBOSE] [-T TIMEOUT]

options:
-h, --help

show this help message and exit

-f FILENAME, --filename FILENAME

name of SLHA or LHE input file or a directory path (required argument). If a directory is given, loop over all files in the directory

-p PARAMETERFILE, --parameterFile PARAMETERFILE

name of parameter file, where most options are defined (optional argument). If not set, use all parameters from smodels/etc/parameters_default.ini

-o OUTPUTDIR, --outputDir OUTPUTDIR

name of output directory (optional argument). The default folder is: ./results/

-d, --development

if set, SModelS will run in development mode and exit if any errors are found.

-t, --force_txt

force loading the text database

-C, --colors

colored output

-V, --version

show program’s version number and exit

-c, --run-crashreport

parse crash report file and use its contents for a SModelS run. Supply the crash file simply via '--filename myfile.crash'

-v VERBOSE, --verbose VERBOSE

sets the verbosity level (debug, info, warning, error). Default value is info.

-T TIMEOUT, --timeout TIMEOUT

define a limit on the running time (in seconds). If not set, run without a time limit. If a directory is given as input, the timeout is applied to each individual file.

A typical usage example is:

runSModelS.py -f inputFiles/slha/simplyGluino.slha -p parameters.ini -o ./ -v warning

The resulting output will be generated in the current folder, according to the printer options set in the parameters file.
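The same call can also be assembled from a Python script, for instance when scanning over several input files. The sketch below only builds and prints the argument list from the options documented above; the helper name build_command is hypothetical (it is not part of SModelS), and whether runSModelS.py can be found on the PATH depends on the installation.

```python
import shlex


def build_command(filename, parameter_file=None, output_dir=None,
                  verbosity=None, timeout=None):
    """Assemble a runSModelS.py call (hypothetical helper, not part of SModelS)."""
    cmd = ["runSModelS.py", "-f", filename]  # -f is the only required argument
    if parameter_file is not None:
        cmd += ["-p", parameter_file]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if verbosity is not None:
        cmd += ["-v", verbosity]
    if timeout is not None:
        cmd += ["-T", str(timeout)]  # running time limit in seconds
    return cmd


cmd = build_command("inputFiles/slha/simplyGluino.slha",
                    parameter_file="parameters.ini",
                    output_dir="./", verbosity="warning")
# Print the command as it would be typed in a shell
print(shlex.join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would then execute the tool.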

The Parameters File

The basic options and parameters used by runSModelS.py are defined in the parameters file. An example parameter file, including all available parameters together with a short description, is stored in parameters.ini. If no parameter file is specified, the default parameters stored in smodels/etc/parameters_default.ini are used. Below we give more detailed information about each entry in the parameters file.

  • options: main options for turning SModelS features on or off

  • checkInput (True/False): if True, runSModelS.py will run the file check tool on the input file and verify if the input contains all the necessary information.

  • doCompress (True/False): turns mass compression on or off during the decomposition. (Note that the compression is only applied to prompt particles, with widths larger than the promptWidth parameter.)

  • testCoverage (True/False): set to True to run the coverage tool.

  • computeStatistics (True/False): turns the likelihood computation on or off (see likelihood calculation). If True, the likelihoods L_BSM, L_SM and L_max are computed for the EM-type results.

  • combineSRs (True/False): set to True to combine signal regions in EM-type results when a covariance matrix or a pyhf JSON likelihood is available. Set to False to use only the most sensitive signal region (faster!). Available v1.1.3 onwards for covariance matrices and v1.2.4 onwards for full likelihoods (using pyhf).

  • reportAllSRs (True/False): set to True to report all signal regions, instead of the best signal region only. From v3.0.0 onwards it will also include the combined SRs if combineSRs=True. Beware, the output can be long.

  • combineAnas (list of results): list of analysis IDs to be combined. All the analyses are assumed to be fully uncorrelated, so use with caution! Available from v2.2.0 onwards. NB, due to issues with pyhf, for the time being it is advisable to use this feature only with combineSRs=False.

  • experimentalFeatures (True/False): set to True to enable experimental features that are not yet considered part of SModelS proper. Available from v2.1.1 onwards. Use with care.

  • particles: defines the particle content of the BSM model

  • model: pathname to the Python file that defines the particle content of the BSM model, or to an SLHA file containing QNUMBERS blocks for the BSM particles (see Basic Input). The Python file can be given either in Unix file notation ("/path/to/model.py") or as a Python module path ("path.to.model"). Defaults to share.models.mssm, which is a standard MSSM. See the smodels/share/models folder for more examples. The directory name can be omitted; in that case, the current working directory as well as smodels/share/models are searched.

  • promptWidth: total decay width in GeV above which decays are considered prompt. The default is 1e-11 (it was 1e-8 in v2.0 and v2.1 and was changed to 1e-11 in v2.2). Available v2.0 onwards.

  • stableWidth: total decay width in GeV below which particles are considered as (quasi)stable, default is 1e-25; available v2.0 onwards.

  • ignorePromptQNumbers: list of quantum numbers to be ignored for promptly decaying particles (particles with width larger than promptWidth). Since many experimental searches are not sensitive to the properties of particles with prompt decays, SModelS has the option to erase the quantum numbers of these particles. For instance, if ignorePromptQNumbers="spin,eCharge,colordim", the spin, electric charge and color properties of promptly decaying particles will be ignored. This can greatly reduce the running time, but must be used with caution. If this parameter is not defined, all quantum numbers will be kept. Available v3.0.0 onwards.
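The role of the two width thresholds can be illustrated with a small sketch. The function classify_width and the label "intermediate" are hypothetical names chosen for this example and are not part of SModelS; only the default values of promptWidth and stableWidth are taken from the descriptions above.

```python
PROMPT_WIDTH = 1e-11  # GeV, default promptWidth (v2.2 onwards)
STABLE_WIDTH = 1e-25  # GeV, default stableWidth


def classify_width(total_width, prompt_width=PROMPT_WIDTH,
                   stable_width=STABLE_WIDTH):
    """Classify a total decay width in GeV (hypothetical helper)."""
    if total_width > prompt_width:
        return "prompt"        # decays above promptWidth are considered prompt
    if total_width < stable_width:
        return "stable"        # below stableWidth: (quasi)stable particle
    return "intermediate"      # neither prompt nor (quasi)stable


for width in (1.0, 1e-15, 0.0):
    print(f"width = {width:g} GeV -> {classify_width(width)}")
```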

  • parameters: basic parameter values for running SModelS

  • sigmacut (float): minimum value for an SMS weight (in fb). SMS topologies with a weight below sigmacut are neglected during the decomposition of SLHA files (see Minimum Decomposition Weight). The default value is 0.005 fb. Note that, depending on the input model, the running time may increase considerably if sigmacut is too low, while too large values might eliminate relevant SMS topologies.

  • minmassgap (float): maximum value of the mass difference (in GeV) for performing mass compression. Only used if doCompress = True.

  • maxcond (float): maximum allowed value (in the [0,1] interval) for the violation of upper limit conditions. A zero value means the conditions are strictly enforced, while 1 means the conditions are never enforced. Only relevant for printing the output summary.

  • ncpus (int): number of CPUs. When processing multiple SLHA/LHE files, SModelS can run in a parallelized fashion, splitting up the input files in equal chunks. ncpus = 0 parallelizes to as many processes as number of CPU cores of the machine, negative values mean parallelization to number of CPU cores minus the absolute value of ncpus (but at least 1). Default value is 1. Warning: python already parallelizes many tasks internally.
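The ncpus convention described above can be written out as a short sketch. Both helpers are hypothetical and not part of SModelS; in particular, the round-robin split is only one possible way of forming chunks of nearly equal size and is an assumption about the chunking, not a description of the actual implementation.

```python
import os


def resolve_ncpus(ncpus, n_cores):
    """Translate the ncpus setting into a number of processes (hypothetical helper)."""
    if ncpus == 0:
        return n_cores                        # one process per CPU core
    if ncpus < 0:
        return max(1, n_cores - abs(ncpus))   # leave |ncpus| cores free, at least 1
    return ncpus


def split_files(files, n_chunks):
    """Distribute files over n_chunks lists of nearly equal size (assumed scheme)."""
    return [files[i::n_chunks] for i in range(n_chunks)]


n_cores = os.cpu_count() or 1
n_proc = resolve_ncpus(-1, n_cores)
chunks = split_files(["a.slha", "b.slha", "c.slha", "d.slha", "e.slha"], n_proc)
print(n_proc, chunks)
```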

  • path: the absolute (or relative) path to the database. The user can supply either the directory name of the database, or the path to the pickle file. HTTP addresses may also be given, e.g. https://smodels.github.io/database/official230. See the github database release page for a list of public database versions. Shorthand notations are available: path=official refers to the official database of your SModelS version, while path=latest refers to the latest available database release. The ‘+’ operator allows for extending the “official” or “latest” database with add-ons:

    • +fastlim: adds fastlim results (from early 8 TeV ATLAS analyses); from v2.1.0 onward

    • +superseded: adds results which were previously available but were superseded by newer ones; from v2.1.0 onward

    • +nonaggregated: adds analyses with non-aggregated SRs in addition to the aggregated results in CMS analyses; from v2.2.0 onward

    • +full_llhds: replaces simplified HistFactory statistical models by full ones in ATLAS analyses; from v2.3.0 onward (careful, this increases the runtime considerably!)

Examples are path=official+fastlim, path=official+nonaggregated, path=official+nonaggregated+full_llhds. Note that order matters: results are replaced in the specified sequence, so path=nonaggregated+official will fall back onto the official database with aggregated results. In principle, the add-ons can also be used alone, e.g. path=nonaggregated, though this is of little practical use. Finally, debug refers to a version of the database with extra information that is however not intended for usage by a regular user and only mentioned here for completeness.

  • analyses (list of results): set to [‘all’] to use all available results. If a list of experimental analyses is given, only these will be used. For instance, setting analyses = CMS-PAS-SUS-13-008,ATLAS-CONF-2013-024 will only use the experimental results from CMS-PAS-SUS-13-008 and ATLAS-CONF-2013-024. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. For example, analyses = CMS* leads to evaluation of results from the CMS experiment only, while *SUS* selects everything containing SUS, no matter if from CMS or ATLAS. Furthermore, the selection of analyses can be restricted by centre-of-mass energy with a suffix beginning with a colon and an energy string in unum style, like :13*TeV. Note that the asterisk behind the colon is not a wildcard. :13, :13TeV and :13 TeV are also understood but discouraged.

  • txnames (list of topologies): set to [‘all’] to use all available simplified model topologies. The SMS topologies are labeled according to the txname convention. If a list of txnames is given, only the corresponding topologies will be considered. For instance, setting txnames = T2 will only consider experimental results for \(pp \to \tilde{q} + \tilde{q} \to (jet+\tilde{\chi}_1^0) + (jet+\tilde{\chi}_1^0)\) and the output will only contain constraints for this topology. A list of all SMS topologies and their corresponding txnames can be found here. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. For example, txnames = T[12]*bb* picks all txnames beginning with T1 or T2 and containing bb; at the time of writing these were: T1bbbb, T1bbbt, T1bbqq, T1bbtt, T2bb, T2bbWW, T2bbWWoff.

  • dataselector (list of datasets): set to [‘all’] to use all available data sets. If dataselector = upperLimit (efficiencyMap), only UL-type results (EM-type results) will be used. Furthermore, if a list of signal regions (data sets) is given, only the experimental results containing these datasets will be used. For instance, if dataselector = SRA mCT150,SRA mCT200, only these signal regions will be used. Wildcards (*, ?, [<list-of-or’ed-letters>]) are expanded in the same way the shell does wildcard expansion for file names. Wildcard examples are given above.

  • dataTypes: data type of the results to be used (all, efficiencyMap or upperLimit). Can be wildcarded with the usual shell wildcards: * ? [<list-of-or’ed-letters>]. Wildcard examples are given above.
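Shell-style wildcard expansion as used for analyses, txnames and dataselector can be tried out with Python's standard fnmatch module, which implements the same *, ? and [...] rules. The sketch below is only an illustration on small hand-written lists; that SModelS performs the matching in exactly this way is an assumption, and the lists are not a complete inventory of the database.

```python
from fnmatch import fnmatchcase

# Small illustrative lists (not the full database content)
txnames = ["T1", "T1bbbb", "T1bbbt", "T1bbqq", "T1bbtt", "T1tttt",
           "T2", "T2bb", "T2bbWW", "T2bbWWoff", "T2tt", "TChiWZ"]
analyses = ["CMS-PAS-SUS-13-008", "ATLAS-CONF-2013-024",
            "ATLAS-SUSY-2019-09", "CMS-SUS-13-013"]


def select(names, pattern):
    """Return the names matching a shell-style pattern (case sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(select(txnames, "T[12]*bb*"))  # txnames starting with T1 or T2 and containing bb
print(select(analyses, "CMS*"))      # CMS results only
print(select(analyses, "*SUS*"))     # everything containing SUS
```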

  • printer: main options for the output format

  • outputType (list of outputs): use to list all the output formats to be generated. Available output formats are: summary, stdout, log, python, xml, slha.

  • outputFormat: use to select in which format the output should be written. Available formats are: current (latest format) or version2 (SModelS 2.x format using bracket notation)

  • stdout-printer: options for the stdout or log printer

  • printDatabase (True/False): set to True to print the list of selected experimental results to stdout.

  • addAnaInfo (True/False): set to True to include detailed information about the txnames tested by each experimental result. Only used if printDatabase=True.

  • addSMSInfo (True/False): set to True to include detailed information about the SMS topologies generated by the decomposition. Only used if printDecomp=True.

  • printExtendedResults (True/False): set to True to print extended information about the theory predictions, including the PIDs of the particles contributing to the predicted cross section, their masses and the expected upper limit (if available).

  • addCoverageID (True/False): set to True to print the list of SMS IDs contributing to each missing topology (see coverage). Only used if testCoverage = True. This option should be used along with addSMSInfo = True so the user can precisely identify which SMS topologies were classified as missing.

  • summary-printer: options for the summary printer

  • expandedSummary (True/False): set True to include in the summary output all applicable experimental results, False for only the strongest one.

  • slha-printer: options for the SLHA printer

    • expandedOutput (True/False): set True to print the full list of results. If False only the most constraining result and excluding results are printed.

  • python-printer: options for the Python printer

  • addSMSList (True/False): set True to include in the Python output all information about all SMS topologies generated in the decomposition. If set to True the output file can be quite large.

  • addTxWeights (True/False): set True to print the contribution from individual topologies to each theory prediction. Available v1.1.3 onwards.

  • addNodesMap (True/False): set True to include the mapping of the nodes indices to the BSM labels. Available v3.0.0 onwards.

  • xml-printer: options for the xml printer

  • addSMSList (True/False): set True to include in the xml output all information about all SMS topologies generated in the decomposition. If set to True the output file can be quite large.

  • addTxWeights (True/False): set True to print the contribution from individual topologies to each theory prediction. Available v1.1.3 onwards.
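A reduced parameters file can be written and read back with Python's configparser, as sketched below. The section names follow the entries described above, except for [database], which is assumed here as the section holding path, analyses, txnames and dataselector. The values for minmassgap and maxcond are illustrative only; the complete and authoritative list of options is the one in parameters.ini.

```python
import configparser
import io

config = configparser.ConfigParser()
config.optionxform = str  # keep the camelCase option names exactly as written

config["options"] = {"checkInput": "True", "doCompress": "True",
                     "testCoverage": "True", "combineSRs": "False"}
config["particles"] = {"model": "share.models.mssm",
                       "promptWidth": "1e-11", "stableWidth": "1e-25"}
config["parameters"] = {"sigmacut": "0.005", "minmassgap": "5.",   # illustrative
                        "maxcond": "0.2", "ncpus": "1"}            # illustrative
config["database"] = {"path": "official", "analyses": "all",       # assumed section name
                      "txnames": "all", "dataselector": "all"}
config["printer"] = {"outputType": "summary,python"}
config["summary-printer"] = {"expandedSummary": "True"}

# Write to an in-memory buffer; use open("my_parameters.ini", "w") for a real file
buffer = io.StringIO()
config.write(buffer)
print(buffer.getvalue())

# Read the text back and access typed values
parsed = configparser.ConfigParser()
parsed.optionxform = str
parsed.read_string(buffer.getvalue())
print(parsed.getfloat("parameters", "sigmacut"),
      parsed.getboolean("options", "checkInput"))
```

Setting optionxform to str matters here because configparser lower-cases option names by default, which would turn checkInput into checkinput.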

The Output

The results of runSModelS.py are printed to the format(s) specified by outputType in the parameters file. The available formats are the ones listed for outputType above: summary, stdout, log, python, xml and slha.

In addition, when running over multiple files, a simple text output (summary.txt) is generated with basic information about the results for each input file. A detailed explanation of the information contained in each type of output is given in SModelS Output.

Example.py

Although runSModelS.py provides the main SModelS features with a command line interface, users more familiar with Python and the SModelS language may prefer to write their own main program. A simple example code for this purpose is provided in examples/Example.py. Below we go step-by-step through this example code:

  • Import the SModelS modules and methods. If the example code file is not located in the smodels installation folder, simply add “sys.path.append(<smodels installation path>)” before importing smodels. Set SModelS verbosity level.

from smodels.tools import coverage
from smodels.base.smodelsLogging import setLogLevel
from smodels.particlesLoader import load
from smodels.share.models.SMparticles import SMList
from smodels.base.model import Model
import time
setLogLevel("info")

# Set the path to the database
import os
  • Set the path to the database URL. Specify which database to use. It can be the path to the smodels-database folder, the path to a pickle file or (starting with v1.1.3) a URL path.

    BSMList = load()
  • Load the BSM particles. By default SModelS assumes the MSSM particle content. For using SModelS with a different particle content, the user must define the new particle content and set modelFile to the path of the model file (see particles:model in Parameter File).

    # Path to input file (either a SLHA or LHE file)
#     lhefile = 'inputFiles/lhe/gluino_squarks.lhe'
  • Load the model and set the path to the input file. Load BSM and SM particle content; specify the location of the input file (must be an SLHA or LHE file, see Basic Input) and update particles in the model.

#     model.updateParticles(inputFile=lhefile)

    sigmacut = sigmacut
    mingap = 5.*GeV
    # Decompose model
    topDict = decomposer.decompose(model, sigmacut,
    # Access basic information from decomposition, using the topology list and topology objects:
    print("\n Decomposition done in %1.2fm" %((time.time()-t0)/60.))
    print("\n Decomposition Results: ")
    print("\t  Total number of SMS = %i " % nSMS)

    # Get SMS topologies sorted by largest cross-section*BR:
    smsList = sorted(topDict.getSMSList(), 
                     key = lambda sms: sms.weightList, reverse=True)

output:

 Decomposition Results: 
	  Total number of topologies: 44 
	  Total number of SMS = 7882 
  • Print information about the SMS topologies from the decomposition:

    for sms in smsList[:3]:
        print(f"\t\t SMS  = {sms}")
        print(f"\t\t cross section*BR = {sms.weightList.getMaxXsec()}\n")

    # Load the experimental results to be used.
    # In this case, all results are employed.
    listOfExpRes = database.getExpResults()

output:

		 SMS  = (PV > C1+/C1-(1),N2(2)), (C1+/C1-(1) > N1/N1~,q,q), (N2(2) > N1,q,q)
		 cross section*BR = 7.33E-01 [pb]

		 SMS  = (PV > C1+/C1-(1),N2(2)), (C1+/C1-(1) > N1/N1~,q,c), (N2(2) > N1,q,q)
		 cross section*BR = 7.33E-01 [pb]

		 SMS  = (PV > C1+/C1-(1),C1+/C1-(2)), (C1+/C1-(1) > N1/N1~,q,q), (C1+/C1-(2) > N1/N1~,q,c)
		 cross section*BR = 4.94E-01 [pb]
  • Load the experimental results to be used to constrain the input model. Here, all results are used:

    # Count the number of loaded UL and EM experimental results:

Alternatively, the getExpResults method can take as arguments specific results to be loaded and used.

        elif expType == 'efficiencyMap':
            nEM += 1
    print("\n Loaded Database with %i UL results and %i EM results " % (nUL, nEM))

    # Compute the theory predictions for each experimental result and print them:
    print("\n Theory Predictions and Constraints:")
    rmax = 0.
    bestResult = None

output:

 Loaded Database with 102 UL results and 56 EM results 
        datasetID = theoryPrediction.dataId()
        txnames = sorted([str(txname) for txname in theoryPrediction.txnames])
        print("------------------------")
        print("Dataset = ", datasetID)  # Analysis name
        print("TxNames = ", txnames)
        print("Theory Prediction = ", theoryPrediction.xsection)  # Signal cross section
        print("Condition Violation = ", theoryPrediction.conditions)  # Condition violation values

        # Get the corresponding upper limit:
        print("UL for theory prediction = ", theoryPrediction.upperLimit)

output:

 Theory Predictions and Constraints:

 ATLAS-SUSY-2019-09 
------------------------
Dataset =  None
TxNames =  ['TChiWZoff']
Theory Prediction =  2.63E+00 [pb]
Condition Violation =  [0.0]
  • Get the corresponding upper limit. This value can be compared to the theory prediction to decide whether a model is excluded or not:

        print("r = %1.3E" % r)

output:

UL for theory prediction =  1.20E-01 [fb]
  • Print the r-value, i.e. the ratio theory prediction/upper limit. A value of \(r \geq 1\) means that an experimental result excludes the input model. For EM-type results also compute the likelihood values. Determine the most constraining result:

            theoryPrediction.computeStatistics()
            print('L_BSM, L_SM, L_max = %1.3E, %1.3E, %1.3E' % (theoryPrediction.likelihood(),
                    theoryPrediction.lsm(), theoryPrediction.lmax()))
        if r > rmax:
            rmax = r
            bestResult = theoryPrediction.analysisId()

output:

r = 2.888E+00
L_BSM, L_SM, L_max = 2.298E-09, 8.011E-03, 8.158E-03
  • Print the most constraining experimental result. Using the largest r-value, determine if the model has been excluded or not by the selected experimental results:

    # Print the most constraining experimental result
    print("\nThe largest r-value (theory/upper limit ratio) is %1.3E" % rmax)
    if rmax > 1.:
        print("(The input model is likely excluded by %s)" % bestResult)
    else:
        print("(The input model is not excluded by the simplified model results)")

    print("\n Theory Predictions done in %1.2fm" %((time.time()-t0)/60.))
    t0 = time.time()
    # Select a few results for combination:

output:

The largest r-value (theory/upper limit ratio) is 4.783E+00
(The input model is likely excluded by ATLAS-SUSY-2019-09)
  • Select analyses. Using the theory predictions, select a (user-defined) subset of analyses to be combined:

        if expID not in combineAnas:
            continue
        if tp.likelihood() is None:
            continue
        selectedTheoryPreds.append(tp)
    # Make sure each analysis appears only once:
    expIDs = [tp.analysisId() for tp in selectedTheoryPreds]
    if len(expIDs) != len(set(expIDs)):
        print("\nDuplicated results when trying to combine analyses. Combination will be skipped.")
  • Combine analyses. Using the selected analyses, combine them under the assumption they are fully uncorrelated:

        lsm = combiner.lsm()
  • Print the combination. Print the r-values and likelihood for the combination:


    print("\n Combination of analyses done in %1.2fm" %((time.time()-t0)/60.))
    t0 = time.time()
    # Find out missing topologies for sqrts=13*TeV:

output:

Combined analyses: ATLAS-SUSY-2013-11,CMS-SUS-13-013
Combined r value: 2.053E-02
Combined r value (expected): 2.004E-02
Likelihoods: L, L_max, L_SM =  2.069E-03,  2.079E-03,  2.079E-03
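The assumption of fully uncorrelated analyses means that the joint likelihood factorizes into the product of the individual likelihoods. The sketch below illustrates only this statistical assumption with made-up numbers; the function names are hypothetical and it does not reproduce the SModelS combiner.

```python
import math


def combined_likelihood(likelihoods):
    """Joint likelihood of independent analyses: product of the individual ones."""
    return math.prod(likelihoods)


def combined_nll(likelihoods):
    """Same quantity as a negative log-likelihood (sum of individual terms)."""
    return -sum(math.log(value) for value in likelihoods)


# Hypothetical L_BSM values of two uncorrelated analyses
l_bsm = [2.0e-3, 5.0e-2]
print("combined L_BSM = %1.3E" % combined_likelihood(l_bsm))
print("combined -ln L = %1.3f" % combined_nll(l_bsm))
```

Working with the negative log-likelihood avoids numerical underflow when many small likelihood values are multiplied.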
  • Identify missing topologies. Using the output from decomposition, identify the missing topologies and print some basic information:

    # Print uncovered cross-sections:
    for group in groups:
        print("\nTotal cross-section for %s (fb): %10.3E\n" % (group.description, group.getTotalXSec()))

    missingTopos = uncovered.getGroup('missing (prompt)')
    # Print some of the missing topologies:
    if missingTopos.finalStateSMS:

output:

Total cross-section for missing topologies (fb):  1.062E+04


Total cross-section for missing topologies with displaced decays (fb):  0.000E+00


Total cross-section for missing topologies with prompt decays (fb):  1.402E+04


Total cross-section for topologies outside the grid (fb):  3.823E+03

It is worth noting that SModelS does not include any statistical treatment of the results, such as correction factors for the “look elsewhere effect”. For this reason, the output describes the model as “likely excluded” rather than “excluded”.
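The exclusion logic of the example boils down to a ratio of two cross sections in the same units. The sketch below reproduces it with hypothetical analysis names and numbers, independently of the SModelS classes; the comparison rmax > 1 follows the check used in the example code.

```python
def r_value(prediction_fb, upper_limit_fb):
    """Ratio of theory prediction to upper limit, both given in fb."""
    return prediction_fb / upper_limit_fb


# Hypothetical results: (analysis ID, theory prediction in fb, upper limit in fb)
results = [("EXP-A", 50.0, 20.0),
           ("EXP-B", 3.0, 12.0)]

rmax = 0.0
best_result = None
for analysis_id, prediction_fb, upper_limit_fb in results:
    r = r_value(prediction_fb, upper_limit_fb)
    print("%s: r = %1.3E" % (analysis_id, r))
    if r > rmax:  # keep track of the most constraining result
        rmax = r
        best_result = analysis_id

if rmax > 1.0:
    print("(The input model is likely excluded by %s)" % best_result)
else:
    print("(The input model is not excluded by the simplified model results)")
```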

Notes:
  • For an SLHA input file, the decays of SM particles (or BSM Z2-even particles) are always ignored during the decomposition. Furthermore, if there are two cross sections at different calculation order (say LO and NLO) for the same process, only the highest order is used.

  • The list of SMS topologies can be extremely long. Try setting addSMSInfo = False and/or printDecomp = False to obtain a smaller output.

  • A comment of caution is in order regarding naively using the highest \(r\)-value reported by SModelS, as this does not necessarily come from the most sensitive analysis. For a rigorous statistical interpretation, one should use the \(r\)-value of the result with the highest expected \(r\) (\(r_{exp}\)). Unfortunately, for UL-type results, the expected limits are often not available; \(r_{exp}\) is then reported as N/A in the SModelS output.
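The difference between the highest observed r-value and the r-value of the most sensitive result can be made explicit with a few lines of Python. The dictionaries below are hypothetical stand-ins for SModelS results, with None marking an expected value that is not available.

```python
# Hypothetical results; r_exp is None when no expected limit is available
results = [
    {"id": "EXP-A", "r": 1.3, "r_exp": 0.6},
    {"id": "EXP-B", "r": 0.9, "r_exp": 1.1},
    {"id": "EXP-C", "r": 1.6, "r_exp": None},
]

# Naive choice: the result with the highest observed r-value
highest_observed = max(results, key=lambda res: res["r"])

# Statistically preferred choice: the result with the highest expected r-value,
# restricted to the results for which r_exp is known
with_expected = [res for res in results if res["r_exp"] is not None]
most_sensitive = max(with_expected, key=lambda res: res["r_exp"])

print("highest observed r:", highest_observed["id"], highest_observed["r"])
print("most sensitive    :", most_sensitive["id"], most_sensitive["r"])
```

In this made-up case the naive choice would point to EXP-C, whereas the most sensitive result with a known expected limit, EXP-B, has r below 1.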

[1] SLHA files including decay tables and cross sections, together with the corresponding model.py, can conveniently be generated via the SModelS-micrOMEGAS interface, see arXiv:1606.03834.