experiment package
Submodules
experiment.databaseObj module
- class experiment.databaseObj.Database(base=None, force_load=None, discard_zeroes=True, progressbar=False, subpickle=True, combinationsmatrix=None)[source]
Bases:
object
Database object. Holds a list of SubDatabases. Delegates all calls to SubDatabases.
- Parameters
base – path to the database, or pickle file (string), or http address. If None, “official”, or “official_fastlim”, use the official database for your code version (including fastlim results, if specified). If “latest”, or “latest_fastlim”, check for the latest database. Multiple databases may be specified using “+” as a delimiter.
force_load – force loading the text database (“txt”), or binary database (“pcl”); do not force anything if None
discard_zeroes – discard txnames with only zeroes as entries.
progressbar – show a progressbar when building pickle file (needs the python-progressbar module)
subpickle – produce small pickle files per exp result. Should only be used when working on the database.
combinationsmatrix – an optional dictionary that contains info about combinable analyses, e.g. { “anaid1”: ( “anaid2”, “anaid3” ) } optionally specifying signal regions, e.g. { “anaid1:SR1”: ( “anaid2:SR2”, “anaid3” ) }
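The dictionary format of combinationsmatrix can be illustrated with a small lookup helper. This is only a sketch: the function name `is_combinable` and the choice to treat the relation as symmetric are assumptions made for illustration, not the package's implementation.

```python
# Hypothetical combinations matrix in the documented format.
combinations_matrix = {
    "anaid1": ("anaid2", "anaid3"),
    "anaid1:SR1": ("anaid2:SR2", "anaid3"),
}


def is_combinable(matrix, label_a, label_b):
    """Return True if label_b is listed for label_a, or vice versa.

    Labels are "analysisID" or "analysisID:signalRegion" strings.
    Treating the relation as symmetric is an assumption of this sketch.
    """
    return label_b in matrix.get(label_a, ()) or label_a in matrix.get(label_b, ())


print(is_combinable(combinations_matrix, "anaid1", "anaid2"))      # True
print(is_combinable(combinations_matrix, "anaid3", "anaid1:SR1"))  # True
print(is_combinable(combinations_matrix, "anaid2", "anaid3"))      # False
```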
- createLinksToCombinationsMatrix()[source]
in all globalInfo objects, create a shallow link to the combinations matrix
- property databaseParticles
Database particles, a list, one entry per sub
- property databaseVersion
The version of the database, concatenation of the individual versions
- property expResultList
The combined list, compiled from the individual lists
- getExpResults(analysisIDs=['all'], datasetIDs=['all'], txnames=['all'], dataTypes=['all'], useSuperseded=False, useNonValidated=False, onlyWithExpected=False)[source]
Returns a list of ExpResult objects.
Each object refers to an analysisID containing one (for UL) or more (for Efficiency maps) dataset (signal region) and each dataset containing one or more TxNames. If analysisIDs is defined, returns only the results matching one of the IDs in the list. If dataTypes is defined, returns only the results matching a dataType in the list. If datasetIDs is defined, returns only the results matching one of the IDs in the list. If txnames is defined, returns only the results matching one of the Tx names in the list.
- Parameters
analysisIDs – list of analysis ids ([CMS-SUS-13-006,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>] Furthermore, the centre-of-mass energy can be chosen as suffix, e.g. “:13*TeV”. Note that the asterisk in the suffix is not a wildcard.
datasetIDs – list of dataset ids ([ANA-CUT0,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
txnames – list of txnames ([TChiWZ,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
dataTypes – dataType of the analysis (all, efficiencyMap or upperLimit) Can be wildcarded with usual shell wildcards: * ? [<letters>]
useSuperseded – If False, the supersededBy results will not be included (deprecated)
useNonValidated – If False, the results with validated = False will not be included
onlyWithExpected – Return only those results that have expected values also. Note that this is trivially fulfilled for all efficiency maps.
- Returns
list of ExpResult objects or the ExpResult object if the list contains only one result
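The shell-style wildcard matching described for analysisIDs, datasetIDs and txnames can be sketched with the standard-library `fnmatch` module. The helper `match_ids` and the example IDs are assumptions for illustration. The special centre-of-mass-energy suffix (e.g. “:13*TeV”) is not handled here, since its asterisk is documented as not being a wildcard.

```python
from fnmatch import fnmatchcase


def match_ids(patterns, candidates):
    """Return the candidates matching at least one shell-style pattern.

    The special entry "all" matches everything, mirroring the default
    argument ['all'] in the signature above.
    """
    if "all" in patterns:
        return list(candidates)
    return [c for c in candidates
            if any(fnmatchcase(c, p) for p in patterns)]


# Hypothetical analysis IDs, used only for illustration.
ids = ["CMS-SUS-13-006", "CMS-SUS-16-033", "ATLAS-SUSY-2013-02"]

print(match_ids(["CMS-SUS-1[36]-*"], ids))  # ['CMS-SUS-13-006', 'CMS-SUS-16-033']
print(match_ids(["ATLAS*"], ids))           # ['ATLAS-SUSY-2013-02']
print(len(match_ids(["all"], ids)))         # 3
```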
- property pcl_meta
The meta info of the pickle (binary) version, a merger of the original ones
- property txt_meta
The meta info of the text version, a merger of the original ones
- class experiment.databaseObj.ExpResultList(expResList)[source]
Bases:
object
Holds a list of ExpResult objects for printout.
- Parameters
expResList – list of ExpResult objects
- class experiment.databaseObj.SubDatabase(base=None, force_load=None, discard_zeroes=True, progressbar=False, subpickle=True, combinationsmatrix=None)[source]
Bases:
object
SubDatabase object. Holds a list of ExpResult objects.
- Parameters
base – path to the database, or pickle file (string), or http address. If None, “official”, or “official_fastlim”, use the official database for your code version (including fastlim results, if specified). If “latest”, or “latest_fastlim”, check for the latest database. Multiple databases may be named, using “+” as delimiter. Order matters: results with the same name will be overwritten according to the sequence.
force_load – force loading the text database (“txt”), or binary database (“pcl”); do not force anything if None
discard_zeroes – discard txnames with only zeroes as entries.
progressbar – show a progressbar when building pickle file (needs the python-progressbar module)
subpickle – produce small pickle files per exp result. Should only be used when working on the database.
combinationsmatrix – an optional dictionary that contains info about combinable analyses, e.g. { “anaid1”: ( “anaid2”, “anaid3” ) } optionally specifying signal regions, e.g. { “anaid1:SR1”: ( “anaid2:SR2”, “anaid3” ) }
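How a “+”-delimited base string and the "order matters" rule could interact is sketched below. The helpers `split_bases` and `merge_results` are hypothetical, and the assumption that a later database overrides an earlier one for results of the same name is an interpretation of the description above, not confirmed behaviour.

```python
def split_bases(base):
    """Split a '+'-delimited database specification into its parts."""
    return [part.strip() for part in base.split("+") if part.strip()]


def merge_results(sub_results):
    """Merge per-database {name: result} dicts in sequence.

    Assumption of this sketch: entries from later databases replace
    entries with the same name from earlier ones.
    """
    merged = {}
    for results in sub_results:
        merged.update(results)
    return merged


print(split_bases("official+./my_database"))  # ['official', './my_database']

# Hypothetical result names and payloads.
first = {"ANA-1": "from official", "ANA-2": "from official"}
second = {"ANA-2": "from my_database"}
print(merge_results([first, second]))
# {'ANA-1': 'from official', 'ANA-2': 'from my_database'}
```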
- property base
This is the path to the base directory.
- checkPathName(path, discard_zeroes)[source]
Checks the path name and returns the base directory and the pickle file name. If path starts with http or ftp, fetches the description file and the database first.
- createBinaryFile(filename=None)[source]
create a pcl file from the text database, potentially overwriting an old pcl file.
- createLinksToCombinationsMatrix()[source]
in all globalInfo objects, create links to self.combinationsmatrix
- property databaseVersion
The version of the database, read from the ‘version’ file.
- fetchFromScratch(path, store, discard_zeroes)[source]
Fetch the database from scratch, together with its description.
- Parameters
store – filename to store json file.
- getExpResults(analysisIDs=['all'], datasetIDs=['all'], txnames=['all'], dataTypes=['all'], useSuperseded=False, useNonValidated=False, onlyWithExpected=False)[source]
Returns a list of ExpResult objects.
Each object refers to an analysisID containing one (for UL) or more (for Efficiency maps) dataset (signal region) and each dataset containing one or more TxNames. If analysisIDs is defined, returns only the results matching one of the IDs in the list. If dataTypes is defined, returns only the results matching a dataType in the list. If datasetIDs is defined, returns only the results matching one of the IDs in the list. If txnames is defined, returns only the results matching one of the Tx names in the list.
- Parameters
analysisIDs – list of analysis ids ([CMS-SUS-13-006,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>] Furthermore, the centre-of-mass energy can be chosen as suffix, e.g. “:13*TeV”. Note that the asterisk in the suffix is not a wildcard.
datasetIDs – list of dataset ids ([ANA-CUT0,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
txnames – list of txnames ([TChiWZ,…]). Can be wildcarded with usual shell wildcards: * ? [<letters>]
dataTypes – dataType of the analysis (all, efficiencyMap or upperLimit) Can be wildcarded with usual shell wildcards: * ? [<letters>]
useSuperseded – If False, the supersededBy results will not be included (deprecated)
useNonValidated – If False, the results with validated = False will not be included
onlyWithExpected – Return only those results that have expected values also. Note that this is trivially fulfilled for all efficiency maps.
- Returns
list of ExpResult objects or the ExpResult object if the list contains only one result
- inNotebook()[source]
Are we running within a notebook? Has an effect on the progressbar we wish to use.
- loadBinaryFile(lastm_only=False)[source]
Load a binary database, returning last modified, file count, database.
- Parameters
lastm_only – if true, the database itself is not read.
- Returns
database object, or None, if lastm_only == True.
- loadDatabase()[source]
If no binary file is available, load the text database and create the binary file. If a binary file is available, check whether it needs an update and, if so, create a new binary file.
experiment.datasetObj module
- class experiment.datasetObj.CombinedDataSet(expResult)[source]
Bases:
object
Holds the information for a combined dataset (used for combining multiple datasets).
- getDataSet(datasetID)[source]
Returns the dataset with the corresponding dataset ID. If the dataset is not found, returns None.
- Parameters
datasetID – dataset ID (string)
- Returns
DataSet object if found, otherwise None.
- getIndex(dId, datasetOrder)[source]
Get the index of the dataset within datasetOrder, allowing for abbreviated names.
- Parameters
dId – id of dataset to search for, may be abbreviated
datasetOrder – the ordered list of datasetIds, long form
- Returns
index, or -1 if not found
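A lookup that tolerates abbreviated names might look like the sketch below. The abbreviation rule used here (a long-form ID ends with the abbreviated ID) is purely an assumption for illustration. The actual rule is not stated in this reference.

```python
def get_index(d_id, dataset_order):
    """Return the position of d_id in dataset_order, or -1 if not found.

    An exact match is tried first. Otherwise the first long-form entry
    that ends with d_id is accepted (assumed abbreviation rule).
    """
    if d_id in dataset_order:
        return dataset_order.index(d_id)
    for i, long_name in enumerate(dataset_order):
        if long_name.endswith(d_id):
            return i
    return -1


# Hypothetical long-form dataset IDs.
order = ["ANA-1:SR1", "ANA-1:SR2", "ANA-2:SR3"]

print(get_index("ANA-1:SR2", order))  # 1
print(get_index("SR3", order))        # 2
print(get_index("SR9", order))        # -1
```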
- getLumi()[source]
Return the dataset luminosity. For CombinedDataSet always return the value defined in globalInfo.lumi.
- class experiment.datasetObj.DataSet(path=None, info=None, createInfo=True, discard_zeroes=True, databaseParticles=None)[source]
Bases:
object
Holds the information to a data set folder (TxName objects, dataInfo,…)
- Parameters
discard_zeroes – discard txnames with zero-only results
- checkForRedundancy(databaseParticles)[source]
In case of efficiency maps, check if any txnames have overlapping constraints. This would result in double counting, so we do not allow it.
- getAttributes(showPrivate=False)[source]
Checks for all the fields/attributes it contains as well as the attributes of its objects if they belong to smodels.experiment.
- Parameters
showPrivate – if True, also returns the protected fields (_field)
- Returns
list of field names (strings)
- getEfficiencyFor(txname, mass)[source]
Convenience function. Get efficiency for mass assuming no lifetime rescaling. Same as self.getTxName(txname).getEfficiencyFor(mass)
- getLumi()[source]
Return the dataset luminosity. If not defined for the dataset, use the value defined in globalInfo.lumi.
- getSRUpperLimit(expected=False)[source]
Returns the 95% upper limit on the signal*efficiency for a given dataset (signal region). Only to be used for efficiency map type results.
- Parameters
expected – If True, return the expected limit ( i.e. Nobserved = NexpectedBG )
- Returns
upper limit value
- getUpperLimitFor(element=None, expected=False, txnames=None, compute=False, alpha=0.05, deltas_rel=0.2)[source]
Returns the upper limit for a given element (or mass) and txname. If the dataset holds an efficiency-map (EM) result, the upper limit is independent of the input txname or mass. For UL results, if an Element object is given, the corresponding upper limit will be rescaled according to the lifetimes of the element's intermediate particles. On the other hand, if a mass is given, no rescaling will be applied.
- Parameters
txname – TxName object or txname string (only for UL-type results)
element – Element object or mass array with units (only for UL-type results)
alpha – Can be used to change the C.L. value. The default value is 0.05 (= 95% C.L.) (only for efficiency-map results)
deltas_rel – relative uncertainty in signal (float). Default value is 20%.
expected – Compute expected limit, i.e. Nobserved = NexpectedBG (only for efficiency-map results)
compute – If True, the upper limit will be computed from expected and observed number of events. If False, the value listed in the database will be used instead.
- Returns
upper limit (Unum object)
- getValuesFor(attribute)[source]
Returns a list for the possible values appearing in the ExpResult for the required attribute (sqrts,id,constraint,…). If there is a single value, returns the value itself.
- Parameters
attribute – name of a field in the database (string).
- Returns
list of unique values for the attribute
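The "list of unique values, or the value itself if there is only one" convention can be sketched as follows. The helper `values_for` and the stand-in objects are assumptions for illustration only.

```python
from types import SimpleNamespace


def values_for(objects, attribute):
    """Collect the unique values of `attribute` over `objects`.

    Returns the bare value if exactly one unique value exists,
    otherwise a list (possibly empty) in order of first appearance.
    """
    values = []
    for obj in objects:
        if hasattr(obj, attribute):
            value = getattr(obj, attribute)
            if value not in values:
                values.append(value)
    return values[0] if len(values) == 1 else values


# Hypothetical stand-ins for objects carrying database fields.
items = [
    SimpleNamespace(sqrts="13*TeV", id="T1"),
    SimpleNamespace(sqrts="13*TeV", id="T2"),
]

print(values_for(items, "sqrts"))  # 13*TeV
print(values_for(items, "id"))     # ['T1', 'T2']
```

Callers of such a function have to handle both return types, which is worth keeping in mind when iterating over the result.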
- isCombinableWith(other)[source]
Function that reports if two datasets are mutually uncorrelated = combinable.
- Parameters
other – datasetObj to compare self with
experiment.defaultFinalStates module
experiment.exceptions module
experiment.expResultObj module
- class experiment.expResultObj.ExpResult(path=None, discard_zeroes=True, databaseParticles=None)[source]
Bases:
object
Object containing the information and data corresponding to an experimental result (experimental conference note or publication).
- Parameters
path – Path to the experimental result folder, None means transient experimental result
discard_zeroes – Discard maps with only zeroes
databaseParticles – the model, i.e. the particle content
- getAttributes(showPrivate=False)[source]
Checks for all the fields/attributes it contains as well as the attributes of its objects if they belong to smodels.experiment.
- Parameters
showPrivate – if True, also returns the protected fields (_field)
- Returns
list of field names (strings)
- getEfficiencyFor(txname, mass, dataset=None)[source]
Convenience function. Get the efficiency for a specific dataset and a specific txname. Equivalent to: self.getDataset ( dataset ).getEfficiencyFor ( txname, mass )
- getTxnameWith(restrDict={})[source]
Returns a list of TxName objects satisfying the restrictions. The restrictions are specified as a dictionary.
- Parameters
restrDict – dictionary containing the fields and their allowed values. E.g. {‘txname’ : ‘T1’, ‘axes’ : ….} The dictionary values can be single entries or a list of values. For the fields not listed, all values are assumed to be allowed.
- Returns
list of TxName objects if more than one txname matches the selection criteria or a single TxName object, if only one matches the selection.
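The restriction-dictionary semantics (single values or lists of allowed values, unlisted fields unrestricted) can be sketched like this. `filter_by_restrictions` and the stand-in objects are hypothetical. The sketch also uses `None` as the default instead of a mutable `{}`, which is a common Python precaution rather than something taken from the package.

```python
from types import SimpleNamespace


def filter_by_restrictions(objects, restr_dict=None):
    """Keep the objects whose fields satisfy every restriction.

    Each dictionary value may be a single allowed value or a list of
    allowed values. Returns one object if exactly one matches,
    otherwise a list.
    """
    restr_dict = restr_dict or {}
    selected = []
    for obj in objects:
        for field, allowed in restr_dict.items():
            if not isinstance(allowed, (list, tuple, set)):
                allowed = [allowed]
            if getattr(obj, field, None) not in allowed:
                break
        else:
            # No restriction failed for this object.
            selected.append(obj)
    return selected[0] if len(selected) == 1 else selected


# Hypothetical stand-ins for TxName objects.
txnames = [
    SimpleNamespace(txName="T1", axes="a"),
    SimpleNamespace(txName="T2", axes="a"),
    SimpleNamespace(txName="T2", axes="b"),
]

print(filter_by_restrictions(txnames, {"txName": "T1"}).txName)   # T1
print(len(filter_by_restrictions(txnames, {"txName": ["T2"]})))   # 2
print(len(filter_by_restrictions(txnames)))                        # 3
```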
- getUpperLimitFor(dataID=None, alpha=0.05, expected=False, txname=None, mass=None, compute=False)[source]
Computes the 95% upper limit (UL) on the signal cross section according to the type of result. For an Efficiency Map type, returns the UL for the signal*efficiency for the given dataSet ID (signal region). For an Upper Limit type, returns the UL for the signal*BR for the given mass array and Txname.
- Parameters
dataID – dataset ID (string) (only for efficiency-map type results)
alpha – Can be used to change the C.L. value. The default value is 0.05 (= 95% C.L.) (only for efficiency-map results)
expected – Compute expected limit, i.e. Nobserved = NexpectedBG (only for efficiency-map results)
txname – TxName object or txname string (only for UL-type results)
mass – Mass array with units (only for UL-type results)
compute – If True, the upper limit will be computed from expected and observed number of events. If False, the value listed in the database will be used instead.
- Returns
upper limit (Unum object)
- getValuesFor(attribute)[source]
Returns a list for the possible values appearing in the ExpResult for the required attribute (sqrts,id,constraint,…). If there is a single value, returns the value itself.
- Parameters
attribute – name of a field in the database (string).
- Returns
list of unique values for the attribute
experiment.infoObj module
- class experiment.infoObj.Info(path=None)[source]
Bases:
object
Holds the meta data information contained in a .txt file (luminosity, sqrts, experimentID,…). Its attributes are generated according to the lines in the .txt file which contain “info_tag: value”.
- Parameters
path – path to the .txt file
- addInfo(tag, value)[source]
Adds the info field labeled by tag with value value to the object.
- Parameters
tag – information label (string)
value – value for the field in string format
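Turning “info_tag: value” lines into attributes can be sketched as below. The class `InfoSketch`, the sample tags, and the handling of blank and `#` lines are assumptions for illustration, not the package's parser.

```python
class InfoSketch:
    """Minimal stand-in for an object built from 'info_tag: value' lines."""

    def __init__(self, lines):
        for line in lines:
            line = line.strip()
            # Skipping blank lines and '#' comments is an assumption.
            if not line or line.startswith("#") or ":" not in line:
                continue
            tag, value = line.split(":", 1)
            self.add_info(tag.strip(), value.strip())

    def add_info(self, tag, value):
        """Add the info field labeled by tag, with the value kept as a string."""
        setattr(self, tag, value)


# Hypothetical file content.
text = """
# example metadata
id: ANA-EXAMPLE-01
sqrts: 13*TeV
lumi: 35.9/fb
"""

info = InfoSketch(text.splitlines())
print(info.id, info.sqrts, info.lumi)  # ANA-EXAMPLE-01 13*TeV 35.9/fb
```

Splitting on the first colon only keeps values that themselves contain a colon intact.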
experiment.metaObj module
- class experiment.metaObj.Meta(pathname, discard_zeroes=None, mtime=None, filecount=None, hasFastLim=None, databaseVersion=None, format_version=214, python='3.11.6 (main, Oct 18 2023, 21:49:15) [GCC 11.3.0]')[source]
Bases:
object
- Parameters
pathname – filename of pickle file, or dirname of text files
discard_zeroes – do we discard zeroes?
mtime – last modification time stamps
filecount – number of files
hasFastLim – fastlim in the database?
databaseVersion – version of database
format_version – format version of pickle file
python – python version
- current_version = 214
The Meta object holds all meta information regarding the database, like number of analyses, last time of modification, … This info is needed to understand if we have to re-pickle.
- determineLastModified(force=False)[source]
Compute the last-modified timestamp and count the number of files. Only applies to a text database.
- lastModifiedSubDir(subdir)[source]
Return the last modified timestamp of subdir (working recursively) plus the number of files.
- Parameters
subdir – directory name that is checked
lastm – the most recent timestamp so far, plus number of files
- Returns
the most recent timestamp, and the number of files
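A recursive scan that returns the most recent modification time together with the file count can be sketched with `os.walk`. The function name `last_modified_subdir` and the temporary directory layout are assumptions for illustration.

```python
import os
import tempfile


def last_modified_subdir(subdir):
    """Return (most recent mtime, number of files) below subdir."""
    latest = 0.0
    count = 0
    for dirpath, _dirnames, filenames in os.walk(subdir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            latest = max(latest, os.path.getmtime(path))
            count += 1
    return latest, count


# Build a small throwaway tree to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(tmp, rel), "w") as handle:
            handle.write("x")
    latest, count = last_modified_subdir(tmp)

print(count)  # 2
```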
- needsUpdate(current)[source]
Do we need an update with respect to <current>? Here <current> is the meta info of the text database and <self> that of the pcl (pickle) file.
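One plausible staleness check compares the pickle's metadata against the text database's metadata. The criteria below (format version, modification time, file count) are assumptions drawn from the fields listed for Meta. The real decision logic may differ.

```python
from dataclasses import dataclass


@dataclass
class MetaSketch:
    """Hypothetical subset of the metadata fields described above."""
    mtime: float
    filecount: int
    format_version: int = 214


def needs_update(pcl, txt, current_format=214):
    """Return True if the pickle metadata is stale relative to the text one."""
    return (
        pcl.format_version != current_format  # pickle format changed
        or pcl.mtime < txt.mtime              # text files modified later
        or pcl.filecount != txt.filecount     # files added or removed
    )


txt = MetaSketch(mtime=2000.0, filecount=10)
print(needs_update(MetaSketch(mtime=2500.0, filecount=10), txt))  # False
print(needs_update(MetaSketch(mtime=1500.0, filecount=10), txt))  # True
print(needs_update(MetaSketch(mtime=2500.0, filecount=9), txt))   # True
```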
experiment.txnameObj module
- class experiment.txnameObj.Delaunay1D(data)[source]
Bases:
object
Uses a 1D data array to interpolate the data. The attribute simplices is a list of N-1 pairs of ints with the indices of the points forming the simplices (e.g. [[0,1],[1,2],[2,3],…]).
- checkData(data)[source]
Define the simplices according to data. Compute and store the transformation matrix and simplices self.point.
- find_index(xlist, x)[source]
Efficient way to find x in a list. Returns the index (i) of xlist such that xlist[i] < x <= xlist[i+1]. If x > max(xlist), returns the length of the list. If x < min(xlist), returns 0.
- Parameters
xlist – List of x-type objects
x – object to be searched for.
- Returns
Index of the list such that xlist[i] < x <= xlist[i+1].
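The documented contract can be reproduced with a binary search from the standard-library `bisect` module. This is a sketch of the stated behaviour, assuming xlist is non-empty and sorted in ascending order. It is not the class's own implementation.

```python
from bisect import bisect_left


def find_index(xlist, x):
    """Return i such that xlist[i] < x <= xlist[i+1].

    Returns len(xlist) if x > max(xlist) and 0 if x < min(xlist).
    Assumes xlist is sorted in ascending order and non-empty.
    """
    if x < xlist[0]:
        return 0
    if x > xlist[-1]:
        return len(xlist)
    # bisect_left gives the first position j with xlist[j] >= x,
    # so the interval (xlist[j-1], xlist[j]] contains x.
    return max(bisect_left(xlist, x) - 1, 0)


grid = [1.0, 2.0, 3.0, 4.0]
print(find_index(grid, 2.5))  # 1
print(find_index(grid, 2.0))  # 0
print(find_index(grid, 5.0))  # 4
print(find_index(grid, 0.5))  # 0
```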
- class experiment.txnameObj.TxName(path, globalObj, infoObj, databaseParticles)[source]
Bases:
object
Holds the information related to one txname in the Txname.txt file (constraint, condition,…) as well as the data.
- addInfo(tag, value)[source]
Adds the info field labeled by tag with value value to the object.
- Parameters
tag – information label (string)
value – value for the field in string format
- fetchAttribute(attr, fillvalue=None)[source]
Auxiliary method to get the attribute from self. If not found, look for it in datasetInfo and if still not found look for it in globalInfo. If not found in either of the above, return fillvalue.
- Parameters
attr – Name of attribute (string)
fillvalue – Value to be returned if attribute is not found.
- Returns
Value of the attribute or fillvalue, if attribute was not found.
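The fallback chain (the object itself, then datasetInfo, then globalInfo, then fillvalue) can be sketched as follows. The attribute names `datasetInfo` and `globalInfo` on the object, and the stand-in data, are assumptions for illustration.

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel, so that a stored None still counts as found


def fetch_attribute(txname, attr, fillvalue=None):
    """Look up attr on txname, then datasetInfo, then globalInfo."""
    holders = (
        txname,
        getattr(txname, "datasetInfo", None),
        getattr(txname, "globalInfo", None),
    )
    for holder in holders:
        if holder is None:
            continue
        value = getattr(holder, attr, _MISSING)
        if value is not _MISSING:
            return value
    return fillvalue


# Hypothetical stand-in for a TxName with its two info objects.
tx = SimpleNamespace(
    txName="T1",
    datasetInfo=SimpleNamespace(dataType="upperLimit"),
    globalInfo=SimpleNamespace(sqrts="13*TeV", dataType="ignored"),
)

print(fetch_attribute(tx, "txName"))         # T1
print(fetch_attribute(tx, "dataType"))       # upperLimit
print(fetch_attribute(tx, "sqrts"))          # 13*TeV
print(fetch_attribute(tx, "lumi", "n/a"))    # n/a
```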
- getEfficiencyFor(element)[source]
For upper limit results, checks if the input element falls inside the upper limit grid and has a non-zero reweighting factor. If it does, returns efficiency = 1, else returns efficiency = 0. For efficiency map results, returns the signal efficiency including the lifetime reweighting. If a mass array is given as input, no lifetime reweighting will be applied.
- Parameters
element – Element object or mass array with units.
- Returns
efficiency (float)
- getInfo(infoLabel)[source]
Returns the value of info field.
- Parameters
infoLabel – label of the info field (string). It must be an attribute of the TxNameInfo object
- getMassVectorFromElement(element)[source]
Given an element, extract the mass vector for the server query. The element can be a list of masses or an “Element” object.
- Returns
mass vector, e.g. [[300,100],[300,100]]
- getULFor(element, expected=False)[source]
Returns the upper limit (or expected) for element (only for upperLimit-type). Includes the lifetime reweighting (ul/reweight). If called for efficiencyMap results raises an error. If a mass array is given as input, no lifetime reweighting will be applied.
- Parameters
element – Element object or mass array (with units)
expected – look in self.txnameDataExp, not self.txnameData
- hasElementAs(element)[source]
Verify if the conditions or constraint in Txname contains the element. Check both branch orderings. If both orderings match, returns the one with the highest mass array.
- Parameters
element – Element object
- Returns
A copy of the element on the correct branch ordering appearing in the Txname constraint or condition.
- class experiment.txnameObj.TxNameData(value, dataType, Id, accept_errors_upto=0.05, Leff_inner=None, Leff_outer=None)[source]
Bases:
object
Holds the data for the Txname object. It holds Upper limit values or efficiencies.
- Parameters
value – values in string format
dataType – the dataType (upperLimit or efficiencyMap)
Id – an identifier, must be unique for each TxNameData!
accept_errors_upto – If None, do not allow extrapolations outside of the convex hull. If a float value is given, allow that much relative uncertainty on the upper limit / efficiency when extrapolating outside the convex hull. This can be used to loosen the equal branches assumption.
Leff_inner – is the effective inner radius of the detector, given in meters (used for reweighting prompt decays). If None, default values will be used.
Leff_outer – is the effective outer radius of the detector, given in meters (used for reweighting decays outside the detector). If None, default values will be used.
- computeV(values)[source]
Compute rotation matrix _V, and triangulation self.tri
- Parameters
values – Nested array with the data values without units
- coordinatesToData(point, rotMatrix=None, transVector=None)[source]
A function that returns the original mass and width array (including the widths as tuples) for a given point in PCA space (inverse of dataToCoordinates).
- Parameters
point – Point in PCA space (1D list with size equal to self.full_dimensionality or self.dimensionality)
rotMatrix – Rotation matrix for PCA (e.g. self._V). If None, no rotation is performed.
transVector – Translation vector for PCA (e.g. self.delta_x). If None no translation is performed
- Returns
nested mass array including the widths as tuples (e.g. [[(200,1e-10),100],[(200,1e-10),100]])
- dataToCoordinates(dataPoint, rotMatrix=None, transVector=None)[source]
Format a dataPoint to the format used for interpolation. All the units are removed, the widths are rescaled and the masses and widths are combined in a flat array. The input can be an Element object or a nested massAndWidth array (with tuples to store the relevant widths).
- Parameters
dataPoint – Element object from which the mass and width arrays will be extracted or a nested mass array from the database, which contain tuples to include the width values
rotMatrix – Rotation matrix for PCA (e.g. self._V). If None, no rotation is performed.
transVector – Translation vector for PCA (e.g. self.delta_x). If None no translation is performed
- Returns
Point (list of floats)
- getDataShape(value)[source]
Determines the data format (mass shape) and stores it for future use. If there are inclusive objects (mass or branch = None), stores their positions.
- Parameters
value – list of data points
- getUnits(value)[source]
Get standard units for the input object. Uses the units defined in physicsUnits.standardUnits. (e.g. [[100*GeV,100.*GeV],3.*pb] -> returns [[GeV,GeV],fb] [[100*GeV,3.],[200.*GeV,2.*pb]] -> returns [[GeV,1.],[GeV,fb]] )
- Parameters
value – Object containing units (e.g. [[100*GeV,100.*GeV],3.*pb])
- Returns
Object with same structure containing the standard units used to normalize the data.
- getValueFor(element)[source]
Interpolates the value and returns the UL or efficiency for the respective element rescaled according to the reweighting function self.reweightF. For UL-type data the default rescaling is ul -> ul/(fraction of prompt decays) and for EM-type data it is eff -> eff*(fraction of prompt decays). If a mass array is given as input, no lifetime reweighting will be applied.
- Parameters
element – Element object or mass array (with units)
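The default rescaling rules stated above can be written out as a small function. The name `rescale_value`, the argument `prompt_fraction`, and the choice to return None when the prompt fraction is zero for UL-type data are assumptions of this sketch.

```python
def rescale_value(value, data_type, prompt_fraction):
    """Apply the default lifetime rescaling described above.

    upperLimit:    ul  -> ul / (fraction of prompt decays)
    efficiencyMap: eff -> eff * (fraction of prompt decays)
    """
    if data_type == "upperLimit":
        if prompt_fraction == 0.0:
            # Assumption: no finite limit can be quoted in this case.
            return None
        return value / prompt_fraction
    if data_type == "efficiencyMap":
        return value * prompt_fraction
    raise ValueError(f"unknown data type: {data_type!r}")


print(rescale_value(10.0, "upperLimit", 0.5))     # 20.0
print(rescale_value(0.4, "efficiencyMap", 0.5))   # 0.2
print(rescale_value(10.0, "upperLimit", 0.0))     # None
```

A smaller prompt fraction therefore weakens the constraint in both cases: the upper limit grows and the efficiency shrinks.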
- getValueForPoint(point)[source]
Returns the UL or efficiency for the point (in coordinates) using interpolation
- Parameters
point – Point in coordinate space (length = self.full_dimensionality)
- Returns
Value of UL or efficiency (float) without units
- getWidthPosition(value)[source]
Gets the positions of the widths to be used for interpolation.
- Parameters
value – data point
- Returns
A list with the position of the widths. A position is a tuple of the form (branch-index,vertex-index).
- interpolate(point, fill_value=nan)[source]
Returns the interpolated value for the point (in coordinates)
- Parameters
point – Point in coordinate space (length = self.dimensionality)
- Returns
Value for point without units
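The role of fill_value can be illustrated with a one-dimensional analogue: linear interpolation inside the grid and fill_value outside it. The real method works on points in coordinate space of dimension self.dimensionality. The function `interpolate_1d` and the sample grid are assumptions for illustration.

```python
import math
from bisect import bisect_right


def interpolate_1d(xs, ys, x, fill_value=math.nan):
    """Linearly interpolate ys over xs at x.

    Returns fill_value if x lies outside [xs[0], xs[-1]].
    Assumes xs is strictly increasing and has at least two points.
    """
    if x < xs[0] or x > xs[-1]:
        return fill_value
    # Index of the left end of the interval containing x.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


# Hypothetical grid of masses and unitless values.
masses = [100.0, 200.0, 300.0]
values = [1.0, 0.5, 0.1]

print(interpolate_1d(masses, values, 150.0))              # 0.75
print(math.isnan(interpolate_1d(masses, values, 400.0)))  # True
```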