Common

This page contains the docs for all shared components that can be found under InnerEye/Common/.

class InnerEye.Common.common_util.ModelProcessing(value)[source]

Enum used in model training and inference, used to decide where to put files and what logging messages to print. The meanings of the values are:

ENSEMBLE_CREATION: we are creating and processing an ensemble model from within the child run with cross-validation index 0 of the HyperDrive run that created this model.

DEFAULT: any other situation, including where the model is an ensemble model created by an earlier run (so the current run is standalone, not part of a HyperDrive run).

There are four scenarios, only one of which uses ModelProcessing.ENSEMBLE_CREATION:

1. Training and inference on a single model in a single (non-HyperDrive) run.

2. Training and inference on a single model that is part of an ensemble, in a HyperDrive child run.

3. Inference on an ensemble model taking place in a HyperDrive child run that trained one of the component models of the ensemble and whose cross validation index is 0.

4. Inference on a single or ensemble model created in another run specified by the value of run_recovery_id.

The scenarios occur under the following conditions:

Scenario 1 happens when we train a model (train=True) with number_of_cross_validation_splits=0. In this case, the value of ModelProcessing passed around is DEFAULT.

Scenario 2 happens when we train a model (train=True) with number_of_cross_validation_splits > 0. In this case, the value of ModelProcessing passed around is DEFAULT in each of the child runs while training and running inference on its own single model. However, the child run whose cross validation index is 0 then goes on to carry out Scenario 3, and does more processing with ModelProcessing value ENSEMBLE_CREATION, to create and register the ensemble model, run inference on it, and upload information about the ensemble model to the parent run.

Scenario 4 happens when we do an inference-only run (train=False), and specify an existing model with run_recovery_id (and necessarily number_of_cross_validation_splits=0, even if the recovered run was a HyperDrive one). This model may be either a single one or an ensemble one; in both cases, a ModelProcessing value of DEFAULT is used.

DEFAULT = 'default'

ENSEMBLE_CREATION = 'ensemble_creation'
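The scenario rules above can be summarised in a short sketch. The helper processing_steps and its parameter cross_validation_split_index are hypothetical names introduced only for illustration; they are not part of InnerEye, and the enum is re-declared here so that the snippet is self-contained.

```python
from enum import Enum
from typing import List


class ModelProcessing(Enum):
    # Values copied from the enum documented above.
    DEFAULT = "default"
    ENSEMBLE_CREATION = "ensemble_creation"


def processing_steps(train: bool,
                     number_of_cross_validation_splits: int,
                     cross_validation_split_index: int = 0) -> List[ModelProcessing]:
    """Hypothetical helper: the ModelProcessing values one run passes through, in order."""
    if not train:
        # Scenario 4: inference only, on a recovered single or ensemble model.
        return [ModelProcessing.DEFAULT]
    if number_of_cross_validation_splits == 0:
        # Scenario 1: a single model in a single (non-HyperDrive) run.
        return [ModelProcessing.DEFAULT]
    # Scenario 2: every HyperDrive child run handles its own single model.
    steps = [ModelProcessing.DEFAULT]
    if cross_validation_split_index == 0:
        # Scenario 3: child run 0 goes on to create and process the ensemble.
        steps.append(ModelProcessing.ENSEMBLE_CREATION)
    return steps
```

Under these assumptions, only the child run with cross validation index 0 ever sees ENSEMBLE_CREATION.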

InnerEye.Common.common_util.any_pairwise_larger(items1: Any, items2: Any) → bool[source]: Returns True if any of the elements of items1 is larger than the corresponding element in items2. The two lists must have the same length.
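A minimal sketch of the behaviour described for any_pairwise_larger, written as a standalone function. Raising a ValueError on a length mismatch is an assumption; the text only states that the lengths must match.

```python
from typing import Any, Sequence


def any_pairwise_larger_sketch(items1: Sequence[Any], items2: Sequence[Any]) -> bool:
    """Illustrative re-implementation, not the InnerEye source."""
    if len(items1) != len(items2):
        # Assumed error type; the documentation does not say what is raised.
        raise ValueError(f"Arguments must have the same length, got {len(items1)} and {len(items2)}")
    # Compare element i of items1 with element i of items2.
    return any(a > b for a, b in zip(items1, items2))
```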

InnerEye.Common.common_util.any_smaller_or_equal_than(items: Iterable[Any], scalar: float) → bool[source]: Returns True if any of the elements of the list is smaller than the given scalar number.

InnerEye.Common.common_util.change_working_directory(path_or_str: Union[Path, str]) → Generator[source]: Context manager for changing the current working directory
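A context manager of this kind is usually built on contextlib. The following is a sketch under that assumption, not the InnerEye implementation; the name change_working_directory_sketch is made up for this example.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Generator, Union


@contextmanager
def change_working_directory_sketch(path_or_str: Union[Path, str]) -> Generator:
    """Switch to the given directory, and restore the previous one on exit."""
    old_dir = Path.cwd()
    os.chdir(Path(path_or_str))
    try:
        yield
    finally:
        # Restore the original directory even if the body raised an exception.
        os.chdir(old_dir)


before = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    with change_working_directory_sketch(tmp):
        # Inside the block, the working directory is the temporary folder.
        entered = Path.cwd().resolve() == Path(tmp).resolve()
after = Path.cwd()
```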

InnerEye.Common.common_util.check_is_any_of(message: str, actual: Optional[str], valid: Iterable[Optional[str]]) → None[source]

Raises an exception if ‘actual’ is not any of the given valid values.

Parameters:

message – The prefix for the error message.
actual – The actual value.
valid – The set of valid strings that ‘actual’ is allowed to take on.
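As an illustration of the check described above, here is a standalone sketch. The exception type (ValueError) and the exact message layout are assumptions, since the documentation only says that an exception is raised.

```python
from typing import Iterable, Optional


def check_is_any_of_sketch(message: str, actual: Optional[str],
                           valid: Iterable[Optional[str]]) -> None:
    """Illustrative re-implementation, not the InnerEye source."""
    valid_values = list(valid)
    if actual not in valid_values:
        # Show None explicitly so that the list of allowed values stays readable.
        allowed = ", ".join("<None>" if v is None else v for v in valid_values)
        raise ValueError(f"{message} must be one of [{allowed}], but got: {actual}")


# Passes silently because "train" is one of the valid values.
check_is_any_of_sketch("Mode", "train", ["train", "test", None])
```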

InnerEye.Common.common_util.check_properties_are_not_none(obj: Any, ignore: Optional[List[str]] = None) → None[source]: Checks to make sure the provided object has no properties that have a None value assigned.

InnerEye.Common.common_util.disable_logging_to_file() → None[source]: If logging to a file has been enabled previously via logging_to_file, this call will remove that logging handler.

InnerEye.Common.common_util.empty_string_to_none(x)

InnerEye.Common.common_util.get_best_epoch_results_path(mode: ModelExecutionMode, model_proc: ModelProcessing = ModelProcessing.DEFAULT) → Path[source]

For a given model execution mode, creates the relative results path in the form BEST_EPOCH_FOLDER_NAME/(Train, Test or Val)

Parameters:

mode – model execution mode
model_proc – whether this is for an ensemble or single model. If ensemble, we return a different path to avoid colliding with the results from the single model that may have been created earlier in the same run.

InnerEye.Common.common_util.get_items_from_string(string: str, separator: str = ',', remove_blanks: bool = True) → List[str][source]: Returns a list of items, separated by a known symbol, from a given string.
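The sketch below shows one plausible reading of get_items_from_string. It assumes that remove_blanks strips surrounding whitespace from each item and that empty items are always dropped; neither detail is stated in the text above.

```python
from typing import List


def get_items_from_string_sketch(string: str, separator: str = ",",
                                 remove_blanks: bool = True) -> List[str]:
    """Illustrative re-implementation under the assumptions named in the lead-in."""
    items = string.split(separator)
    if remove_blanks:
        # Assumption: "blanks" means leading and trailing whitespace.
        items = [item.strip() for item in items]
    # Assumption: empty items are never returned.
    return [item for item in items if item]


items = get_items_from_string_sketch("a, b,,c ")
```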

InnerEye.Common.common_util.get_log_level_string(log_level: int) → str[source]

Parameters: log_level – integer version of a log level, e.g. 20.
Returns: string version of the level; throws an error if the level is not registered.

InnerEye.Common.common_util.initialize_instance_variables(func: Callable) → Callable[source]

Automatically assigns the input parameters of the decorated method to instance variables of the same name. Example usage:

from InnerEye.Common.common_util import initialize_instance_variables

class process:
    @initialize_instance_variables
    def __init__(self, cmd, reachable=False, user='root'):
        pass

p = process('halt', True)
print(p.cmd, p.reachable, p.user)

Outputs:

halt True root
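For readers curious how such a decorator can work, here is a minimal sketch based on inspect.signature. It is an assumption about the mechanism, not the InnerEye source, and the names initialize_instance_variables_sketch and Process are hypothetical.

```python
import inspect
from functools import wraps
from typing import Any, Callable


def initialize_instance_variables_sketch(func: Callable) -> Callable:
    """Copy every argument of the wrapped __init__ onto the instance."""
    signature = inspect.signature(func)

    @wraps(func)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> None:
        bound = signature.bind(self, *args, **kwargs)
        # Fill in parameters that the caller left at their default value.
        bound.apply_defaults()
        # Skip the first entry, which is "self".
        for name, value in list(bound.arguments.items())[1:]:
            setattr(self, name, value)
        func(self, *args, **kwargs)

    return wrapper


class Process:
    @initialize_instance_variables_sketch
    def __init__(self, cmd, reachable=False, user="root"):
        pass


p = Process("halt", True)
```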

InnerEye.Common.common_util.is_gpu_tensor(data: Any) → bool[source]

InnerEye.Common.common_util.is_linux() → bool[source]: Returns True if the host operating system is a flavour of Linux.

InnerEye.Common.common_util.is_long_path(path: Union[Path, str]) → bool[source]: A long path is a path that has more than 260 characters

InnerEye.Common.common_util.is_private_field_name(name: str) → bool[source]: A private field is any Python class member whose name starts with an underscore, e.g. _hello.

InnerEye.Common.common_util.is_windows() → bool[source]: Returns True if the host operating system is Windows.

InnerEye.Common.common_util.logging_only_to_file(file_path: Path, stdout_log_level: Union[int, str] = 40) → Generator[source]

Redirects logging to the specified file, undoing that on exit. If logging is currently going to stdout, messages at level stdout_log_level or higher (typically ERROR) are also sent to stdout. Usage:

with logging_only_to_file(my_log_path):
    do_stuff()

Parameters:

file_path – file to log to
stdout_log_level – minimum level for messages to also go to stdout

InnerEye.Common.common_util.logging_section(gerund: str) → Generator[source]

Context manager to print “**** STARTING: …” and “**** FINISHED: …” lines around sections of the log, to help people locate particular sections. Usage:

with logging_section("doing this and that"):
    do_this_and_that()

Parameters: gerund – string expressing what happens in this section of the log.
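The bracketing behaviour can be sketched with contextlib as follows. The logger name, the ListHandler helper and the exact message layout after the colon are assumptions made for this example; the real function may also log extra details.

```python
import logging
from contextlib import contextmanager
from typing import Generator, List

# Hypothetical logger used only by this sketch.
logger = logging.getLogger("logging_section_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False


class ListHandler(logging.Handler):
    """Collects formatted messages in a list so that they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


@contextmanager
def logging_section_sketch(gerund: str) -> Generator:
    logger.info("**** STARTING: %s", gerund)
    yield
    # Only reached if the body finished without raising an exception.
    logger.info("**** FINISHED: %s", gerund)


handler = ListHandler()
logger.addHandler(handler)
with logging_section_sketch("doing this and that"):
    logger.info("working")
```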

InnerEye.Common.common_util.logging_to_file(file_path: Path) → None[source]

Instructs the Python logging libraries to start writing logs to the given file. Logging will use a timestamp as the prefix, using UTC. The logging level will be the same as defined for logging to stdout.

Parameters: file_path – The path and name of the file to write to.

InnerEye.Common.common_util.logging_to_stdout(log_level: Union[int, str] = 20) → None[source]

Instructs the Python logging libraries to start writing logs to stdout up to the given logging level. Logging will use a timestamp as the prefix, using UTC.

Parameters: log_level – The logging level. All logging messages with a level at or above this level will be written to stdout. log_level can be numeric, or one of the pre-defined logging strings (logging.INFO, logging.DEBUG, etc.).

InnerEye.Common.common_util.merge_conda_files(conda_files: List[Path], result_file: Path, pip_files: Optional[List[Path]] = None) → None[source]

Merges the given Conda environment files using the conda_merge package, optionally adds any dependencies from pip requirements files, and writes the merged file to disk.

Parameters:

conda_files – The Conda environment files to read.
result_file – The location where the merge results should be written.
pip_files – An optional list of one or more pip requirements files including extra dependencies.

InnerEye.Common.common_util.namespace_to_path(namespace: str, root: Union[Path, str] = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/innereye-deeplearning/checkouts/latest')) → Path[source]

Given a namespace (in form A.B.C) and an optional root directory R, create a path R/A/B/C

Parameters:

namespace – Namespace to convert to path
root – Path to prefix (default is project root)

InnerEye.Common.common_util.path_to_namespace(path: Path, root: Union[Path, str] = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/innereye-deeplearning/checkouts/latest')) → str[source]

Given a path (in form R/A/B/C) and an optional root directory R, create a namespace A.B.C. If root is provided, then path must be a relative child to it.

Parameters:

path – Path to convert to namespace
root – Path prefix to remove from namespace (default is project root)

Returns:

String representation of the namespace that corresponds to the path
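The two conversions are inverses of each other, which a short sketch makes concrete. This is an illustration only: it uses PurePosixPath so the result does not depend on the host operating system, it takes root as a required argument instead of defaulting to the project root, and it does not treat file suffixes specially.

```python
from pathlib import PurePosixPath
from typing import Union

PathLike = Union[PurePosixPath, str]


def namespace_to_path_sketch(namespace: str, root: PathLike) -> PurePosixPath:
    """Turn namespace A.B.C and root R into the path R/A/B/C."""
    return PurePosixPath(root, *namespace.split("."))


def path_to_namespace_sketch(path: PathLike, root: PathLike) -> str:
    """Turn path R/A/B/C and root R into the namespace A.B.C."""
    # relative_to raises ValueError if path is not located below root.
    return ".".join(PurePosixPath(path).relative_to(root).parts)


path = namespace_to_path_sketch("A.B.C", "/R")
namespace = path_to_namespace_sketch(path, "/R")
```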

InnerEye.Common.common_util.print_exception(ex: Exception, message: str, logger_fn: ~typing.Callable = <function error>) → None[source]

Prints information about an exception, and the full traceback info.

Parameters:

ex – The exception that was caught.
message – An additional prefix that is printed before the exception itself.
logger_fn – The logging function to use for logging this exception

InnerEye.Common.common_util.remove_file_or_directory(pth: Path) → None[source]: Remove a directory and its contents, or a file.

InnerEye.Common.common_util.standardize_log_level(log_level: Union[int, str]) → int[source]

Parameters: log_level – integer or string (any casing) version of a log level, e.g. 20 or “INFO”.
Returns: integer version of the level; throws if the string does not name a level.
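Both log-level helpers (standardize_log_level and get_log_level_string) can be approximated with the standard logging module. The sketch below assumes ValueError as the error type, which the text does not specify, and the function names with the _sketch suffix are made up for illustration.

```python
import logging
from typing import Union


def standardize_log_level_sketch(log_level: Union[int, str]) -> int:
    """Convert a level given as an integer or a name (any casing) to an integer."""
    if isinstance(log_level, int):
        return log_level
    # getLevelName maps a registered name such as "INFO" to its number.
    level = logging.getLevelName(log_level.upper())
    if not isinstance(level, int):
        raise ValueError(f"Unknown log level name: {log_level}")
    return level


def get_log_level_string_sketch(log_level: int) -> str:
    """Convert a registered integer level to its name."""
    name = logging.getLevelName(log_level)
    # For unregistered numbers, getLevelName returns a string like "Level 25".
    if name.startswith("Level "):
        raise ValueError(f"Log level {log_level} is not registered")
    return name
```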

InnerEye.Common.common_util.string_to_path(x)

InnerEye.Common.fixed_paths_for_tests.full_ml_test_data_path(path: str = '') → Path[source]

Takes a relative path inside of the Tests/ML/test_data folder, and returns its full absolute path.

Parameters: path – A path relative to the Tests/ML/test_data folder.
Returns: The full absolute path of the argument.

InnerEye.Common.fixed_paths_for_tests.tests_root_directory(path: Optional[Union[Path, str]] = None) → Path[source]

Gets the full path to the root directory that holds the tests. If a relative path is provided then concatenate it with the absolute path to the repository root.

Returns: The full path to the repository’s root directory, with symlinks resolved if any.

InnerEye.Common.fixed_paths.add_submodules_to_path() → None[source]: This function adds all submodules that the code uses to sys.path and to the environment variables. This is necessary to make the code work without any further changes when switching from/to using hi-ml as a package or as a submodule for development. It also adds the InnerEye root folder to sys.path. The latter is necessary to make AzureML and Pytorch Lightning work together: When spawning additional processes for DDP, the working directory is not correctly picked up in sys.path.

InnerEye.Common.fixed_paths.get_environment_yaml_file() → Path[source]: Returns the path where the environment.yml file is located. This can be inside of the InnerEye package, or in the repository root when working with the code as a submodule. The function throws an exception if the file is not found at either of the two possible locations.
Returns: The full path to the environment file.

InnerEye.Common.fixed_paths.repository_parent_directory(path: Optional[Union[Path, str]] = None) → Path[source]

Gets the full path to the parent directory that holds the present repository.

Parameters: path – if provided, a relative path to append to the absolute path to the repository root.
Returns: The full path to the repository’s root directory, with symlinks resolved if any.

InnerEye.Common.fixed_paths.repository_root_directory(path: Optional[Union[Path, str]] = None) → Path[source]

Gets the full path to the root directory that holds the present repository.

Parameters: path – if provided, a relative path to append to the absolute path to the repository root.
Returns: The full path to the repository’s root directory, with symlinks resolved if any.

class InnerEye.Common.generic_parsing.GenericConfig(should_validate: bool = True, throw_if_unknown_param: bool = False, **params: Any)[source]

Base class for all configuration classes. Provides helper functionality to create an argparser.

add_and_validate(kwargs: Dict[str, Any], validate: bool = True) → None[source]: Add further parameters and, if validate is True, validate. We first try set_param, but that fails when the parameter has a setter.

classmethod add_args(parser: ArgumentParser) → ArgumentParser[source]

Adds all overridable fields of the current class to the given argparser. Fields that are marked as readonly, constant or private are ignored.

Parameters: parser – Parser to add properties to.

apply_overrides(values: Optional[Dict[str, Any]], should_validate: bool = True, keys_to_ignore: Optional[Set[str]] = None) → Dict[str, Any][source]

Applies the provided values overrides to the config. Only properties that are marked as overridable are actually overwritten.

Parameters:

values – A dictionary mapping from field name to value.
should_validate – If true, run the .validate() method after applying overrides.
keys_to_ignore – keys to ignore in reporting failed overrides. If None, do not report.

Returns:

A dictionary with all the fields that were modified.
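To make the override behaviour concrete, here is a deliberately simplified sketch. ConfigSketch is a hypothetical plain-Python stand-in: the real GenericConfig builds on the param library and derives the overridable set from the parameter definitions, whereas this sketch hard-codes it and skips validation and reporting.

```python
from typing import Any, Dict, Optional


class ConfigSketch:
    """Hypothetical stand-in for a config class; not the real GenericConfig."""

    # Assumed for illustration: the names of the fields that may be overridden.
    overridable = {"num_epochs", "learning_rate"}

    def __init__(self) -> None:
        self.num_epochs = 10
        self.learning_rate = 0.001
        self.name = "fixed"

    def apply_overrides(self, values: Optional[Dict[str, Any]]) -> Dict[str, Any]:
        applied: Dict[str, Any] = {}
        for key, value in (values or {}).items():
            # Fields that are not overridable, or not known at all, are skipped.
            if key in self.overridable:
                setattr(self, key, value)
                applied[key] = value
        return applied


config = ConfigSketch()
applied = config.apply_overrides({"num_epochs": 20, "name": "changed", "unknown": 1})
```

In this sketch, only num_epochs ends up in the returned dictionary, because name is not overridable and unknown is not a field of the class.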

classmethod create_argparser() → ArgumentParser[source]: Creates an ArgumentParser with all overridable fields of the current class.
Returns: ArgumentParser

classmethod get_overridable_parameters() → Dict[str, Parameter][source]

Get properties that are not constant, readonly or private (e.g. prefixed with an underscore).

Returns: A dictionary of parameter names and their definitions.

name = 'GenericConfig'

param = <param.parameterized.Parameters object>

classmethod parse_args(args: Optional[List[str]] = None) → T[source]: Creates an argparser based on the params class and parses the command-line arguments (or the args provided).

static reason_not_overridable(value: Parameter) → Optional[str][source]

Parameters: value – a parameter value
Returns: None if the parameter is overridable; otherwise a one-word string explaining why not.

report_on_overrides(values: Dict[str, Any], keys_to_ignore: Set[str]) → None[source]

Logs a warning for every parameter whose value is not as given in “values”, other than those in keys_to_ignore.

Parameters:

values – override dictionary, parameter names to values
keys_to_ignore – set of dictionary keys not to report on

Returns:

None

validate() → None[source]: Validation method called directly after init to be overridden by children if required

class InnerEye.Common.generic_parsing.IntTuple(default=(0, 0), length=None, **params)[source]: Parameter class that must always have integer values

class InnerEye.Common.generic_parsing.ListOrDictParam(default=None, doc=None, label=None, precedence=None, instantiate=False, constant=False, readonly=False, pickle_default_value=True, allow_None=False, per_instance=True)[source]: Wrapper class to allow either a List or Dict inside of a Parameterized object.

class InnerEye.Common.generic_parsing.PathOrPathList(default=None, doc=None, label=None, precedence=None, instantiate=False, constant=False, readonly=False, pickle_default_value=True, allow_None=False, per_instance=True)[source]

Wrapper class to allow either a Path or a list of Paths. Internally represented always as a list.

set_hook(obj: Any, val: Any) → Any[source]: Modifies the value before calling the setter. Here, we are converting a single path to a list of paths.

class InnerEye.Common.generic_parsing.StringOrStringList(default=None, doc=None, label=None, precedence=None, instantiate=False, constant=False, readonly=False, pickle_default_value=True, allow_None=False, per_instance=True)[source]

Wrapper class to allow either a string or a list of strings. Internally represented always as a list.

set_hook(obj: Any, val: Any) → Any[source]: Modifies the value before calling the setter. Here, we are converting all strings to lists of strings.

InnerEye.Common.generic_parsing.create_from_matching_params(from_object: Parameterized, cls_: Type[T]) → T[source]

Creates an object of the given target class, and then copies all attributes from the from_object to the newly created object, if there is a matching attribute. The target class must be a subclass of param.Parameterized.

Parameters:

from_object – The object to read attributes from.
cls_ – The class of the newly created object.

Returns:

An instance of cls_

class InnerEye.Common.metrics_constants.LoggingColumns(value)[source]

This enum contains string constants that act as column names in logging, and in all files on disk.

AccuracyAtOptimalThreshold = 'accuracy_at_optimal_threshold'

AccuracyAtThreshold05 = 'accuracy_at_threshold_05'

AreaUnderPRCurve = 'area_under_pr_curve'

AreaUnderRocCurve = 'area_under_roc_curve'

CrossEntropy = 'cross_entropy'

CrossValidationSplitIndex = 'cross_validation_split_index'

DataSplit = 'data_split'

Dice = 'dice'

Epoch = 'epoch'

ExplainedVariance = 'explained_variance'

FalseNegativeRateAtOptimalThreshold = 'false_negative_rate_at_optimal_threshold'

FalsePositiveRateAtOptimalThreshold = 'false_positive_rate_at_optimal_threshold'

HausdorffDistanceMM = 'HausdorffDistanceMM'

Hue = 'prediction_target'

Institution = 'institutionId'

Label = 'label'

LearningRate = 'learning_rate'

Loss = 'loss'

MeanAbsoluteError = 'mean_absolute_error'

MeanSquaredError = 'mean_squared_error'

ModelExecutionMode = 'model_execution_mode'

ModelOutput = 'model_output'

NumTrainableParameters = 'num_trainable_parameters'

OptimalThreshold = 'optimal_threshold'

Patient = 'subject'

SequenceLength = 'sequence_length'

Series = 'seriesId'

Structure = 'structure'

SubjectCount = 'subject_count'

Tags = 'tags'

class InnerEye.Common.metrics_constants.MetricType(value)[source]

Contains the different metrics that are computed.

ACCURACY_AT_OPTIMAL_THRESHOLD = 'AccuracyAtOptimalThreshold'

ACCURACY_AT_THRESHOLD_05 = 'AccuracyAtThreshold05'

AREA_UNDER_PR_CURVE = 'AreaUnderPRCurve'

AREA_UNDER_ROC_CURVE = 'AreaUnderRocCurve'

CROSS_ENTROPY = 'CrossEntropy'

DICE = 'Dice'

EXPLAINED_VAR = 'ExplainedVariance'

FALSE_NEGATIVE_RATE_AT_OPTIMAL_THRESHOLD = 'FalseNegativeRateAtOptimalThreshold'

FALSE_POSITIVE_RATE_AT_OPTIMAL_THRESHOLD = 'FalsePositiveRateAtOptimalThreshold'

HAUSDORFF_mm = 'HausdorffDistance_millimeters'

LEARNING_RATE = 'LearningRate'

LOSS = 'Loss'

MEAN_ABSOLUTE_ERROR = 'MeanAbsoluteError'

MEAN_SQUARED_ERROR = 'MeanSquaredError'

MEAN_SURFACE_DIST_mm = 'MeanSurfaceDistance_millimeters'

OPTIMAL_THRESHOLD = 'OptimalThreshold'

PATCH_CENTER = 'PatchCenter'

PROPORTION_FOREGROUND_VOXELS = 'ProportionForegroundVoxels'

SUBJECT_COUNT = 'SubjectCount'

VOXEL_COUNT = 'VoxelCount'

class InnerEye.Common.metrics_constants.MetricsFileColumns(value)[source]

Contains the names of the columns in the CSV file that is written by model testing.

Dice = 'Dice'

DiceNumeric = 'DiceNumeric'

HausdorffDistanceMM = 'HausdorffDistance_mm'

MeanDistanceMM = 'MeanDistance_mm'

Patient = 'Patient'

Structure = 'Structure'

class InnerEye.Common.metrics_constants.TrackedMetrics(value)[source]

Known metrics that are tracked as part of Hyperdrive runs.

Val_Loss = 'val/Loss'

class InnerEye.Common.output_directories.OutputFolderForTests(root_dir: Path)[source]

Data class for the output directories for a given test

create_file_or_folder_path(file_or_folder_name: str) → Path[source]

Creates a full path for the given file or folder name relative to the root directory stored in the present object.

Parameters: file_or_folder_name – Name of file or folder to be created under root_dir

make_sub_dir(dir_name: str) → Path[source]

Makes a sub directory under root_dir

Parameters: dir_name – Name of subdirectory to be created.

root_dir: Path

InnerEye.Common.output_directories.remove_and_create_folder(folder: Union[Path, str]) → None[source]: Delete the folder if it exists, and remakes it. This method ignores errors that can come from an explorer window still being open inside of the test result folder.

class InnerEye.Common.resource_monitor.GpuUtilization(id: 'int', load: 'float', mem_util: 'float', mem_allocated_gb: 'float', mem_reserved_gb: 'float', count: 'int')[source]

average() → GpuUtilization[source]: Returns a GPU utilization object that contains all metrics of the present object, divided by the number of observations.
Returns: The GPU utilization object.

count: int

enumerate(prefix: str = '') → List[Tuple[str, float]][source]

Lists all metrics stored in the present object, as (metric_name, value) pairs suitable for logging in Tensorboard.

Parameters: prefix – If provided, this string is used as an additional prefix for the metric name itself. If prefix is “max”, the metric would look like “maxLoad_Percent”.
Returns: A list of (name, value) tuples.

static from_gpu(gpu: GPU) → GpuUtilization[source]

Creates a GpuUtilization object from data coming from the gputil library.

Parameters: gpu – GPU diagnostic data from gputil.
Returns: GpuUtilization object

id: int

load: float

max(other: GpuUtilization) → GpuUtilization[source]

Computes the metric-wise maximum of the two GpuUtilization objects.

Parameters: other – The other GpuUtilization object.
Returns: The metric-wise maximum of the two GpuUtilization objects.

mem_allocated_gb: float

mem_reserved_gb: float

mem_util: float

property name: str: Gets a string name for the GPU that the present object describes, e.g. “GPU1” for the GPU with id == 1.
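The aggregation methods of GpuUtilization can be pictured with a small dataclass. This is a sketch, not the InnerEye source. In particular, it assumes that max() carries a combined observation count and that average() resets the count to 1, neither of which is stated above. It also adds a hypothetical add() helper that is not part of the documented class, so that the demo has a running total to average.

```python
from dataclasses import dataclass


@dataclass
class GpuUtilizationSketch:
    id: int
    load: float
    mem_util: float
    mem_allocated_gb: float
    mem_reserved_gb: float
    count: int

    @property
    def name(self) -> str:
        return f"GPU{self.id}"

    def add(self, other: "GpuUtilizationSketch") -> "GpuUtilizationSketch":
        # Hypothetical helper: running sum of all metrics and observation counts.
        return GpuUtilizationSketch(self.id, self.load + other.load,
                                    self.mem_util + other.mem_util,
                                    self.mem_allocated_gb + other.mem_allocated_gb,
                                    self.mem_reserved_gb + other.mem_reserved_gb,
                                    self.count + other.count)

    def max(self, other: "GpuUtilizationSketch") -> "GpuUtilizationSketch":
        # Metric-wise maximum; the count handling is an assumption.
        return GpuUtilizationSketch(self.id, max(self.load, other.load),
                                    max(self.mem_util, other.mem_util),
                                    max(self.mem_allocated_gb, other.mem_allocated_gb),
                                    max(self.mem_reserved_gb, other.mem_reserved_gb),
                                    self.count + other.count)

    def average(self) -> "GpuUtilizationSketch":
        # Every metric divided by the number of observations.
        return GpuUtilizationSketch(self.id, self.load / self.count,
                                    self.mem_util / self.count,
                                    self.mem_allocated_gb / self.count,
                                    self.mem_reserved_gb / self.count,
                                    1)


first = GpuUtilizationSketch(1, 0.5, 0.25, 2.0, 4.0, 1)
second = GpuUtilizationSketch(1, 0.75, 0.125, 1.0, 8.0, 1)
peak = first.max(second)
mean = first.add(second).average()
```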

class InnerEye.Common.resource_monitor.ResourceMonitor(interval_seconds: int, tensorboard_folder: Path, csv_results_folder: Path)[source]

Monitor and log GPU and CPU stats in TensorBoard in a separate process.

log_to_tensorboard(label: str, value: float) → None[source]

Write a scalar metric value to Tensorboard, marked with the present step.

Parameters:

label – The name of the metric.
value – The value.

read_aggregate_metrics() → Dict[str, Dict[str, float]][source]: Reads the file containing aggregate metrics, and returns them parsed as nested dictionaries mapping from GPU name to metric name to value.

run() → None[source]: Method to be run in sub-process; can be overridden in sub-class

store_to_file() → None[source]: Writes the current aggregate metrics (average and maximum) to a file inside the csv_results_folder.

update_metrics(gpus: List[GPU]) → None[source]

Updates the stored GPU utilization metrics with the current status coming from gputil, and logs them to Tensorboard.

Parameters: gpus – The current utilization information, read from gputil, for all available GPUs.

InnerEye.Common.resource_monitor.memory_in_gb(bytes: int) → float[source]

Converts a memory amount in bytes to gigabytes.

Parameters: bytes – Number of bytes
Returns: Equivalent memory amount in gigabytes
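A one-line sketch of the conversion follows. Whether the real function treats a gigabyte as 2**30 or 10**9 bytes is not stated above; this example assumes the binary definition.

```python
def memory_in_gb_sketch(num_bytes: int) -> float:
    """Convert bytes to gigabytes, assuming 1 GB = 2**30 bytes."""
    gigabyte = 2 ** 30
    return num_bytes / gigabyte


two_gb = memory_in_gb_sketch(2 * 2 ** 30)
```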

InnerEye.Common.spawn_subprocess.spawn_and_monitor_subprocess(process: str, args: List[str], env: Optional[Dict[str, str]] = None) → Tuple[int, List[str]][source]

Helper function to start a subprocess, passing in a given set of arguments, and monitor it. Returns the subprocess exit code and the list of lines written to stdout.

Parameters:

process – The name and path of the executable to spawn.
args – The args to the process.
env – The environment variables that the new process will run with. If not provided, copy the environment from the current process.

Returns:

Return code after the process has finished, and the list of lines that were written to stdout by the subprocess.
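A function with this contract can be sketched with subprocess.Popen, as below. This is an illustration rather than the InnerEye implementation: merging stderr into stdout and echoing each line as it arrives are assumptions. Passing env=None to Popen makes the child inherit the environment of the current process.

```python
import subprocess
import sys
from typing import Dict, List, Optional, Tuple


def spawn_and_monitor_subprocess_sketch(process: str, args: List[str],
                                        env: Optional[Dict[str, str]] = None
                                        ) -> Tuple[int, List[str]]:
    lines: List[str] = []
    with subprocess.Popen([process] + args,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,  # assumption: merge stderr into stdout
                          env=env,
                          universal_newlines=True) as p:
        # Read the output line by line while the subprocess is still running.
        for line in p.stdout:
            line = line.rstrip("\n")
            print(line)
            lines.append(line)
        return_code = p.wait()
    return return_code, lines


# Example: run a tiny Python program in a child interpreter.
code, output = spawn_and_monitor_subprocess_sketch(
    sys.executable, ["-c", "print('hello'); print('world')"])
```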