API Reference
Segmentation models
- segmodels_keras.Unet(backbone_name: str = 'vgg16', input_shape: tuple[int | None, int | None, int] = (None, None, 3), classes: int = 1, activation: str = 'sigmoid', weights: str | Path | None = None, weights_notop: str | Path | None = None, freeze_notop: bool = False, encoder_weights: str | None = 'imagenet', encoder_freeze: bool = False, encoder_features: str | list[int | str] = 'default', decoder_block_type: str = 'upsampling', decoder_filters: tuple[int, ...] = (256, 128, 64, 32, 16), decoder_use_batchnorm: bool = True, **kwargs) Model
Unet is a fully convolutional neural network for image semantic segmentation.
- Parameters:
backbone_name – name of the classification model (without its last dense layers) used as the feature extractor to build the segmentation model.
input_shape – shape of the input data/image, (H, W, C). In the general case you do not need to set H and W; pass (None, None, C) so that the model can process images of any size. H and W of the input images must still be divisible by 32.
classes – number of output classes (output shape is (h, w, classes)).
activation – name of one of keras.activations for the last model layer (e.g. sigmoid, softmax, linear).
weights – path to weights for the entire model to be loaded.
weights_notop – path to model weights without the top layer to be loaded.
freeze_notop – if True, set all layers of the model except the top layer as non-trainable.
encoder_weights – one of None (random initialization) or imagenet (pre-training on ImageNet).
encoder_freeze – if True, set all layers of the encoder (backbone model) as non-trainable.
encoder_features – a list of layer numbers or names, starting from the top of the model. Each of these layers is concatenated with the corresponding decoder block. If default is used, layer names are taken from DEFAULT_SKIP_CONNECTIONS.
decoder_block_type – one of the blocks with the following layer structure:
upsampling: UpSampling2D -> Conv2D -> Conv2D
transpose: Transpose2D -> Conv2D
decoder_filters – list of numbers of Conv2D layer filters in the decoder blocks.
decoder_use_batchnorm – if True, a BatchNormalization layer is used between the Conv2D and Activation layers.
kwargs – additional parameters for the backbone model.
- Returns:
Unet
- Return type:
keras.models.Model
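The divisibility requirement on H and W can be met by padding or resizing images up to the next multiple of 32 before they reach the model. The helper below is a minimal sketch of that arithmetic only; pad_to_multiple is a hypothetical name and is not part of segmodels_keras.

```python
def pad_to_multiple(height: int, width: int, factor: int = 32) -> tuple[int, int]:
    """Return the smallest (H, W) >= (height, width) with both divisible by `factor`."""
    # Ceiling division with integers only: -(-a // b) == ceil(a / b).
    padded_h = -(-height // factor) * factor
    padded_w = -(-width // factor) * factor
    return padded_h, padded_w


print(pad_to_multiple(500, 375))  # (512, 384)
print(pad_to_multiple(256, 256))  # (256, 256)
```

Whether to pad, crop or resize to the computed size depends on the dataset; the sketch only computes the target shape.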
- segmodels_keras.Linknet(backbone_name: str = 'vgg16', input_shape: tuple[int | None, int | None, int] = (None, None, 3), classes: int = 1, activation: str = 'sigmoid', weights: str | Path | None = None, weights_notop: str | Path | None = None, freeze_notop: bool = False, encoder_weights: str | None = 'imagenet', encoder_freeze: bool = False, encoder_features: str | list[int | str] = 'default', decoder_block_type: str = 'upsampling', decoder_filters: tuple[int | None, ...] = (None, None, None, None, 16), decoder_use_batchnorm: bool = True, **kwargs) Model
Linknet is a fully convolutional neural network for fast semantic segmentation.
Note
By default this implementation has 4 skip connections (the original has 3).
- Parameters:
backbone_name – name of the classification model (without its last dense layers) used as the feature extractor to build the segmentation model.
input_shape – shape of the input data/image, (H, W, C). In the general case you do not need to set H and W; pass (None, None, C) so that the model can process images of any size. H and W of the input images must still be divisible by 32.
classes – number of output classes (output shape is (h, w, classes)).
activation – name of one of keras.activations for the last model layer (e.g. sigmoid, softmax, linear).
weights – optional, path to model weights to be loaded.
weights_notop – optional, path to model weights without the top (without the segmentation head) to be loaded.
freeze_notop – if True, set all layers of the model except the top as non-trainable.
encoder_weights – one of None (random initialization) or imagenet (pre-training on ImageNet).
encoder_freeze – if True, set all layers of the encoder (backbone model) as non-trainable.
encoder_features – a list of layer numbers or names, starting from the top of the model. Each of these layers is concatenated with the corresponding decoder block. If default is used, layer names are taken from DEFAULT_SKIP_CONNECTIONS.
decoder_filters – list of numbers of Conv2D layer filters in the decoder blocks. For a block with a skip connection, the number of filters equals the number of filters in the corresponding encoder block (it is determined automatically and can be passed as None).
decoder_use_batchnorm – if True, a BatchNormalization layer is used between the Conv2D and Activation layers.
decoder_block_type – one of:
upsampling: use the UpSampling2D Keras layer
transpose: use the Transpose2D Keras layer
kwargs – additional parameters for the backbone model.
- Returns:
Linknet
- Return type:
keras.models.Model
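The None entries in the default decoder_filters stand for "use the filter count of the matching encoder block". The sketch below shows how such placeholders could be resolved. It is not the library's internal code: resolve_decoder_filters is a hypothetical helper, and the encoder channel counts in skip_filters are made-up values for illustration.

```python
def resolve_decoder_filters(decoder_filters, skip_filters):
    """Replace each None with the filter count of the skip connection at the same index."""
    resolved = []
    for index, filters in enumerate(decoder_filters):
        if filters is not None:
            resolved.append(filters)          # explicit value wins
        elif index < len(skip_filters):
            resolved.append(skip_filters[index])  # inherit from the encoder block
        else:
            raise ValueError(
                f"decoder block {index} has no skip connection; "
                "pass an explicit filter count"
            )
    return resolved


# Hypothetical encoder channel counts at the four skip connections, deepest first.
skip_filters = [512, 256, 128, 64]
print(resolve_decoder_filters((None, None, None, None, 16), skip_filters))
# [512, 256, 128, 64, 16]
```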
- segmodels_keras.FPN(backbone_name: str = 'vgg16', input_shape: tuple[int | None, int | None, int] = (None, None, 3), classes: int = 21, activation: str = 'softmax', weights: str | Path | None = None, weights_notop: str | Path | None = None, freeze_notop: bool = False, encoder_weights: str | None = 'imagenet', encoder_freeze: bool = False, encoder_features: str | list[int | str] = 'default', pyramid_block_filters: int = 256, pyramid_use_batchnorm: bool = True, pyramid_aggregation: str = 'concat', pyramid_dropout: float | None = None, **kwargs) Model
FPN is a fully convolutional neural network for image semantic segmentation.
- Parameters:
backbone_name – name of the classification model (without its last dense layers) used as the feature extractor to build the segmentation model.
input_shape – shape of the input data/image, (H, W, C). In the general case you do not need to set H and W; pass (None, None, C) so that the model can process images of any size. H and W of the input images must still be divisible by 32.
classes – number of output classes (output shape is (h, w, classes)).
weights – optional, path to model weights to be loaded.
weights_notop – optional, path to model weights without the top (without the segmentation head) to be loaded.
freeze_notop – if True, set all layers of the model except the top as non-trainable.
activation – name of one of keras.activations for the last model layer (e.g. sigmoid, softmax, linear).
encoder_weights – one of None (random initialization) or imagenet (pre-training on ImageNet).
encoder_freeze – if True, set all layers of the encoder (backbone model) as non-trainable.
encoder_features – a list of layer numbers or names, starting from the top of the model. Each of these layers is used to build the feature pyramid. If default is used, layer names are taken from DEFAULT_FEATURE_PYRAMID_LAYERS.
pyramid_block_filters – number of filters in the Feature Pyramid Block of the FPN.
pyramid_use_batchnorm – if True, a BatchNormalization layer is used between the Conv2D and Activation layers.
pyramid_aggregation – one of ‘sum’ or ‘concat’. The way to aggregate pyramid blocks.
pyramid_dropout – spatial dropout rate for feature pyramid in range (0, 1).
kwargs – additional parameters for backbone model.
- Returns:
FPN
- Return type:
keras.models.Model
- segmodels_keras.PSPNet(backbone_name: str = 'vgg16', input_shape: tuple[int, int, int] = (384, 384, 3), classes: int = 21, activation: str = 'softmax', weights: str | Path | None = None, weights_notop: str | Path | None = None, freeze_notop: bool = False, encoder_weights: str | None = 'imagenet', encoder_freeze: bool = False, downsample_factor: int = 8, psp_conv_filters: int = 512, psp_pooling_type: str = 'avg', psp_use_batchnorm: bool = True, psp_dropout: float | None = None, **kwargs) Model
PSPNet is a fully convolutional neural network for image semantic segmentation.
- Parameters:
backbone_name – name of the classification model used as the feature extractor to build the segmentation model.
input_shape – shape of the input data/image, (H, W, C). H and W must be divisible by 6 * downsample_factor and must NOT be None!
classes – number of output classes (output shape is (h, w, classes)).
activation – name of one of keras.activations for the last model layer (e.g. sigmoid, softmax, linear).
weights – optional, path to model weights to be loaded.
weights_notop – optional, path to model weights without the top (without the segmentation head) to be loaded.
freeze_notop – if True, set all layers of the model except the top as non-trainable.
encoder_weights – one of None (random initialization) or imagenet (pre-training on ImageNet).
encoder_freeze – if True, set all layers of the encoder (backbone model) as non-trainable.
downsample_factor – one of 4, 8 and 16. The downsampling rate, or in other words the backbone depth at which the PSP module is constructed.
psp_conv_filters – number of filters in the Conv2D layer in each PSP block.
psp_pooling_type – one of ‘avg’, ‘max’. PSP block pooling type (maximum or average).
psp_use_batchnorm – if True, a BatchNormalization layer is used between the Conv2D and Activation layers.
psp_dropout – dropout rate between 0 and 1.
kwargs – additional keyword arguments for some backbones (e.g. groups for the resnext50 and resnext101 backbones).
- Returns:
PSPNet
- Return type:
keras.models.Model
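Unlike the other models, PSPNet needs a fully specified input shape. The check below is a minimal sketch of the documented constraint (H and W set, and divisible by 6 * downsample_factor); check_pspnet_input_shape is a hypothetical helper, not a function exported by the package.

```python
def check_pspnet_input_shape(input_shape, downsample_factor=8):
    """Raise ValueError if (H, W, C) breaks the documented PSPNet constraint."""
    height, width, _channels = input_shape
    if height is None or width is None:
        raise ValueError("PSPNet needs fixed H and W; None is not allowed.")
    multiple = 6 * downsample_factor
    if height % multiple or width % multiple:
        raise ValueError(
            f"H and W must be divisible by {multiple}, got {(height, width)}."
        )
    return multiple


print(check_pspnet_input_shape((384, 384, 3)))                        # 48
print(check_pspnet_input_shape((384, 384, 3), downsample_factor=16))  # 96
```

The default input shape (384, 384, 3) satisfies the constraint for all three documented downsample factors (multiples of 24, 48 and 96).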
- segmodels_keras.get_available_backbone_names()
Get the list of available backbone names.
- segmodels_keras.get_preprocessing(name: str) Callable[[Any], Any]
Get the preprocessing function for a backbone by name.
- segmodels_keras.get_model(model_name: Literal['unet', 'linknet', 'pspnet', 'fpn'], backbone_name: str = 'vgg16', input_shape: tuple[int | None, int | None, int] = (None, None, 3), classes: int = 1, activation: str = 'sigmoid', weights: str | None = None, weights_notop: str | None = None, freeze_notop: bool = False, encoder_weights: str | None = 'imagenet', encoder_freeze: bool = False, **kwargs)
Create a segmentation model with common constructor parameters.
- Parameters:
model_name – Name of the model to create. One of ‘unet’, ‘linknet’, ‘pspnet’, ‘fpn’.
backbone_name – Name of the backbone model. Default is ‘vgg16’.
input_shape – Shape of input data (H, W, C). Default is (None, None, 3).
classes – Number of output classes. Default is 1.
activation – Activation function for the last layer. Default is ‘sigmoid’.
weights – Path to model weights to be loaded. Default is None.
weights_notop – Path to model weights without top (segmentation head) to be loaded. Default is None.
freeze_notop – If True, set all layers except the top as non-trainable. Default is False.
encoder_weights – One of None (random initialization) or ‘imagenet’ (pre-training on ImageNet). Default is ‘imagenet’.
encoder_freeze – If True, set all encoder layers as non-trainable. Default is False.
**kwargs – Additional model-specific parameters. For example:
For Unet/Linknet: encoder_features, decoder_block_type, decoder_filters, decoder_use_batchnorm
For PSPNet: downsample_factor, psp_conv_filters, psp_pooling_type, psp_use_batchnorm, psp_dropout
For FPN: encoder_features, pyramid_block_filters, pyramid_use_batchnorm, pyramid_aggregation, pyramid_dropout
- Returns:
A compiled Keras segmentation model.
- Raises:
ValueError – If model_name is not recognized.
Example
>>> model = get_model(
...     model_name="unet",
...     backbone_name="resnet50",
...     classes=3,
...     activation="softmax"
... )
>>> model_psp = get_model(
...     model_name="pspnet",
...     backbone_name="vgg16",
...     input_shape=(384, 384, 3),
...     classes=21,
...     activation="softmax"
... )
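Because the model-specific parameters travel through **kwargs, a name meant for one architecture can easily be passed to another. The sketch below is a hypothetical pre-flight check built only from the parameter lists documented above; unlisted_kwargs and MODEL_KWARGS are illustrative names, and the library itself may handle such names differently (the constructors also forward extra keyword arguments to the backbone).

```python
# Model-specific keyword names, copied from the **kwargs description above.
_DECODER_KWARGS = {
    "encoder_features", "decoder_block_type",
    "decoder_filters", "decoder_use_batchnorm",
}
MODEL_KWARGS = {
    "unet": _DECODER_KWARGS,
    "linknet": _DECODER_KWARGS,
    "pspnet": {
        "downsample_factor", "psp_conv_filters", "psp_pooling_type",
        "psp_use_batchnorm", "psp_dropout",
    },
    "fpn": {
        "encoder_features", "pyramid_block_filters", "pyramid_use_batchnorm",
        "pyramid_aggregation", "pyramid_dropout",
    },
}


def unlisted_kwargs(model_name, **kwargs):
    """Return the keyword names that are not documented for `model_name`."""
    if model_name not in MODEL_KWARGS:
        raise ValueError(
            f"Unknown model_name {model_name!r}; expected one of {sorted(MODEL_KWARGS)}"
        )
    return sorted(set(kwargs) - MODEL_KWARGS[model_name])


print(unlisted_kwargs("pspnet", downsample_factor=8, decoder_filters=(256, 128)))
# ['decoder_filters']
```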
metrics
- segmodels_keras.metrics.IOUScore(class_weights: Any = None, class_indexes: Any = None, threshold: int | float | None = None, per_image: bool = False, smooth: float = 1e-05, name: str | None = None) None
The Jaccard index.
Also known as Intersection over Union or the Jaccard similarity coefficient (originally coined coefficient de communauté by Paul Jaccard), it is a statistic used for comparing the similarity and diversity of sample sets. The Jaccard coefficient measures similarity between finite sample sets and is defined as the size of the intersection divided by the size of the union of the sample sets.
\[J(A, B) = \frac{|A \cap B|}{|A \cup B|}\]
- Parameters:
class_weights – 1. or np.array of class weights (len(weights) = num_classes).
class_indexes – optional integer or list of integers, the classes to consider; if None, all classes are used.
smooth – value to avoid division by zero.
per_image – if True, the metric is calculated as the mean over images in the batch (B), else over the whole batch.
threshold – value used to round predictions to 0 or 1 (using a > comparison); if None, predictions are not rounded.
- Returns:
A callable iou_score instance. Can be used in the model.compile(...) function.
Example:
metric = IOUScore()
model.compile('SGD', loss=loss, metrics=[metric])
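To make the roles of threshold and smooth concrete, here is a minimal pure-Python sketch of an IoU computation on flat lists of values in [0, 1]. It is an illustration under the definitions above, not the library's tensor implementation; iou_score here is a stand-alone function written for this example.

```python
def iou_score(gt, pr, threshold=None, smooth=1e-5):
    """IoU of two equally long sequences of values in [0, 1]."""
    if threshold is not None:
        # Round predictions to 0/1 with a strict ">" comparison.
        pr = [1.0 if p > threshold else 0.0 for p in pr]
    intersection = sum(g * p for g, p in zip(gt, pr))
    union = sum(gt) + sum(pr) - intersection
    # `smooth` keeps the ratio defined when both masks are empty.
    return (intersection + smooth) / (union + smooth)


gt = [1, 1, 0, 0]
pr = [0.9, 0.4, 0.6, 0.1]
print(round(iou_score(gt, pr, threshold=0.5), 4))  # 0.3333
print(round(iou_score(gt, pr), 4))                 # 0.4815
print(iou_score([0, 0], [0, 0]))                   # 1.0
```

With a threshold the score counts pixels; without one it works on the raw probabilities, so the two values differ for the same inputs.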
- segmodels_keras.metrics.FScore(beta: int | float = 1, class_weights: Any = None, class_indexes: Any = None, threshold: int | float | None = None, per_image: bool = False, smooth: float = 1e-05, name: str | None = None) None
The F-score (Dice coefficient).
This can be interpreted as a weighted harmonic mean of the precision and recall, where an F-score reaches its best value at 1 and its worst at 0. For beta = 1 (the F1-score), precision and recall contribute equally.
The formula for the F-score is:
\[F_\beta(precision, recall) = (1 + \beta^2) \frac{precision \cdot recall} {\beta^2 \cdot precision + recall}\]
The formula in terms of Type I and Type II errors:
\[F_\beta(tp, fp, fn) = \frac{(1 + \beta^2) \cdot tp} {(1 + \beta^2) \cdot tp + \beta^2 \cdot fn + fp}\]
- where:
tp - true positives;
fp - false positives;
fn - false negatives;
- Parameters:
beta – integer or float F-score coefficient that balances precision and recall.
class_weights – 1. or np.array of class weights (len(weights) = num_classes).
class_indexes – optional integer or list of integers, the classes to consider; if None, all classes are used.
smooth – float value to avoid division by zero.
per_image – if True, the metric is calculated as the mean over images in the batch (B), else over the whole batch.
threshold – float value used to round predictions to 0 or 1 (using a > comparison); if None, predictions are not rounded.
name – optional string; if None, the default f{beta}-score name is used.
- Returns:
A callable f_score instance. Can be used in the model.compile(...) function.
Example:
metric = FScore()
model.compile('SGD', loss=loss, metrics=[metric])
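The precision/recall form and the count form of the F-score give the same value when the first term of the count-form denominator is (1 + beta^2) * tp. The sketch below checks this numerically for a few values of beta; both helper functions and the counts are made up for illustration and leave out the smooth term.

```python
def f_score_from_counts(tp, fp, fn, beta=1.0):
    """F-beta written directly in terms of tp, fp and fn."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


def f_score_from_rates(tp, fp, fn, beta=1.0):
    """F-beta written in terms of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


tp, fp, fn = 30, 10, 20  # precision = 0.75, recall = 0.6
for beta in (0.5, 1.0, 2.0):
    by_counts = f_score_from_counts(tp, fp, fn, beta)
    by_rates = f_score_from_rates(tp, fp, fn, beta)
    print(beta, round(by_counts, 4), round(by_rates, 4))
# 0.5 0.7143 0.7143
# 1.0 0.6667 0.6667
# 2.0 0.625 0.625
```

Since precision exceeds recall in this example, beta < 1 (which weights precision more) raises the score and beta > 1 lowers it.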
losses
- segmodels_keras.losses.JaccardLoss(class_weights: Any = None, class_indexes: Any = None, per_image: bool = False, smooth: float = 1e-05) None
Creates a criterion to measure Jaccard loss.
Details:
\[L(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|}\]
- Parameters:
class_weights – array (np.array) of class weights (len(weights) = num_classes).
class_indexes – optional integer or list of integers, the classes to consider; if None, all classes are used.
per_image – if True, the loss is calculated for each image in the batch and then averaged; otherwise the loss is calculated for the whole batch.
smooth – value to avoid division by zero.
- Returns:
A callable jaccard_loss instance. Can be used in the model.compile(...) function or combined with other losses.
Example:
loss = JaccardLoss()
model.compile('SGD', loss=loss)
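The per_image flag changes the result whenever images in a batch contain different amounts of foreground. The pure-Python sketch below illustrates the two reductions on a toy batch of flattened masks; soft_iou and jaccard_loss are written for this example only and are not the library's implementation.

```python
def soft_iou(gt, pr, smooth=1e-5):
    """IoU on raw values in [0, 1], without thresholding."""
    intersection = sum(g * p for g, p in zip(gt, pr))
    union = sum(gt) + sum(pr) - intersection
    return (intersection + smooth) / (union + smooth)


def jaccard_loss(gt_batch, pr_batch, per_image=False, smooth=1e-5):
    """1 - IoU, reduced per image or over the whole batch."""
    if per_image:
        scores = [soft_iou(g, p, smooth) for g, p in zip(gt_batch, pr_batch)]
        return 1.0 - sum(scores) / len(scores)
    # Whole-batch mode: treat all pixels of all images as one set.
    flat_gt = [value for image in gt_batch for value in image]
    flat_pr = [value for image in pr_batch for value in image]
    return 1.0 - soft_iou(flat_gt, flat_pr, smooth)


# Image 1 is predicted perfectly; image 2 misses its single foreground pixel.
gt_batch = [[1, 1, 1, 1], [1, 0, 0, 0]]
pr_batch = [[1, 1, 1, 1], [0, 0, 0, 0]]
print(round(jaccard_loss(gt_batch, pr_batch, per_image=True), 3))   # 0.5
print(round(jaccard_loss(gt_batch, pr_batch, per_image=False), 3))  # 0.2
```

Per-image averaging gives the small object in the second image the same weight as the large one in the first, which is why its loss is higher here.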
- segmodels_keras.losses.DiceLoss(beta: int | float = 1, class_weights: Any = None, class_indexes: Any = None, per_image: bool = False, smooth: float = 1e-05) None
Creates a criterion to measure Dice loss.
Details:
\[L(precision, recall) = 1 - (1 + \beta^2) \frac{precision \cdot recall} {\beta^2 \cdot precision + recall}\]
The formula in terms of Type I and Type II errors:
\[L(tp, fp, fn) = 1 - \frac{(1 + \beta^2) \cdot tp} {(1 + \beta^2) \cdot tp + \beta^2 \cdot fn + fp}\]
- where:
tp - true positives;
fp - false positives;
fn - false negatives;
- Parameters:
beta – Float or integer coefficient for precision and recall balance.
class_weights – array (np.array) of class weights (len(weights) = num_classes).
class_indexes – optional integer or list of integers, the classes to consider; if None, all classes are used.
per_image – if True, the loss is calculated for each image in the batch and then averaged; otherwise the loss is calculated for the whole batch.
smooth – value to avoid division by zero.
- Returns:
A callable dice_loss instance. Can be used in the model.compile(...) function or combined with other losses.
Example:
loss = DiceLoss()
model.compile('SGD', loss=loss)
- segmodels_keras.losses.BinaryCELoss() None
Measures Binary Cross Entropy between ground truth (gt) and prediction (pr).
\[L(gt, pr) = - gt \cdot \log(pr) - (1 - gt) \cdot \log(1 - pr)\]
- Returns:
A callable binary_crossentropy instance. Can be used in the model.compile(...) function or combined with other losses.
Example:
loss = BinaryCELoss()
model.compile('SGD', loss=loss)
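The binary cross-entropy formula can be evaluated by hand on a few pixels. The following is a minimal sketch, assuming a mean over pixels and a small clipping constant eps to keep log() finite; both choices are assumptions of this example rather than documented behaviour of BinaryCELoss.

```python
import math


def binary_cross_entropy(gt, pr, eps=1e-7):
    """Mean of -gt*log(pr) - (1-gt)*log(1-pr) over equally long sequences."""
    total = 0.0
    for g, p in zip(gt, pr):
        p = min(max(p, eps), 1.0 - eps)  # assumed clipping, avoids log(0)
        total += -g * math.log(p) - (1 - g) * math.log(1 - p)
    return total / len(gt)


print(round(binary_cross_entropy([1, 0], [0.9, 0.1]), 4))  # 0.1054
print(round(binary_cross_entropy([1, 0], [0.1, 0.9]), 4))  # 2.3026
```

Confident correct predictions cost about 0.1 per pixel, while confident wrong ones cost about 2.3, which is the asymmetry the logarithm introduces.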
- segmodels_keras.losses.CategoricalCELoss(class_weights: Any = None, class_indexes: Any = None) None
Measures Categorical Cross Entropy between ground truth (gt) and prediction (pr).
\[L(gt, pr) = - gt \cdot \log(pr)\]
- Parameters:
class_weights – array (np.array) of class weights (len(weights) = num_classes).
class_indexes – optional integer or list of integers, the classes to consider; if None, all classes are used.
- Returns:
A callable categorical_crossentropy instance. Can be used in the model.compile(...) function or combined with other losses.
Example:
loss = CategoricalCELoss()
model.compile('SGD', loss=loss)
- segmodels_keras.losses.BinaryFocalLoss(alpha: float = 0.25, gamma: float = 2.0) None
Measures the Binary Focal Loss between ground truth (gt) and prediction (pr).
\[L(gt, pr) = - gt \cdot \alpha \cdot (1 - pr)^\gamma \cdot \log(pr) - (1 - gt) \cdot \alpha \cdot pr^\gamma \cdot \log(1 - pr)\]
- Parameters:
alpha – Float or integer, the same as weighting factor in balanced cross entropy, default 0.25.
gamma – Float or integer, focusing parameter for modulating factor (1 - p), default 2.0.
- Returns:
A callable binary_focal_loss instance. Can be used in the model.compile(...) function or combined with other losses.
Example:
loss = BinaryFocalLoss()
model.compile('SGD', loss=loss)
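The effect of gamma is easiest to see on single pixels: well-classified pixels are down-weighted much more strongly than misclassified ones. The sketch below evaluates the formula above in pure Python, assuming a mean over pixels and a clipping constant eps (both are assumptions of this example, not documented behaviour).

```python
import math


def binary_focal_loss(gt, pr, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over equally long sequences of values in [0, 1]."""
    total = 0.0
    for g, p in zip(gt, pr):
        p = min(max(p, eps), 1.0 - eps)  # assumed clipping, avoids log(0)
        positive = -g * alpha * (1 - p) ** gamma * math.log(p)
        negative = -(1 - g) * alpha * p ** gamma * math.log(1 - p)
        total += positive + negative
    return total / len(gt)


easy = binary_focal_loss([1], [0.9])  # confident and correct
hard = binary_focal_loss([1], [0.1])  # confident and wrong
print(easy < hard)  # True

# With gamma = 0 the modulating factor disappears and the loss is alpha * BCE.
plain = binary_focal_loss([1], [0.9], gamma=0.0)
print(abs(plain - 0.25 * -math.log(0.9)) < 1e-12)  # True
```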
- segmodels_keras.losses.CategoricalFocalLoss(alpha: float = 0.25, gamma: float = 2.0, class_indexes: Any = None) None
Measures Categorical Focal Loss between ground truth (gt) and prediction (pr).
\[L(gt, pr) = - gt \cdot \alpha \cdot (1 - pr)^\gamma \cdot \log(pr)\]
- Parameters:
alpha – Float or integer, the same as weighting factor in balanced cross entropy, default 0.25.
gamma – Float or integer, focusing parameter for modulating factor (1 - p), default 2.0.
class_indexes – Optional integer or list of integers, the classes to consider; if None, all classes are used.
- Returns:
A callable categorical_focal_loss instance. Can be used in the model.compile(...) function or combined with other losses.
Example:
loss = CategoricalFocalLoss()
model.compile('SGD', loss=loss)
utils
- segmodels_keras.utils.set_trainable(model: Model, recompile: bool = True, **kwargs: Any) None
Set all layers of model trainable and recompile it.
Note
Model is recompiled using same optimizer, loss and metrics:
model.compile(
    model.optimizer,
    loss=model.loss,
    metrics=model.metrics,
    loss_weights=model.loss_weights,
    sample_weight_mode=model.sample_weight_mode,
    weighted_metrics=model.weighted_metrics,
)
- Parameters:
model (keras.models.Model) – instance of a Keras model.
recompile – whether to recompile the model after setting its layers trainable.
**kwargs – additional keyword arguments (unused).
- segmodels_keras.utils.save_model_weights_notop(model: Model, decoder: str, path: str | Path, overwrite: bool = True) None
Save model weights without top (without segmentation head).
Weights saved this way can be used to preload a segmentation model for fine-tuning by passing their path to the weights_notop argument of the model constructor, e.g. Unet(weights_notop="path/to/weights.h5").
- Parameters:
model (keras.models.Model) – instance of a Keras model.
decoder – type of the decoder part of the model. Should be one of fpn, linknet, unet, pspnet.
path (str | Path) – path to save the model weights to.
overwrite (bool) – whether to overwrite an existing file at path. Defaults to True.
- segmodels_keras.utils.load_weights(model: Any, filepath: str | Path) None
Load weights from an HDF5 file into a Keras model.
This is an enhanced wrapper around model.load_weights(filepath) that provides compatibility with both Keras 3 and legacy Keras 2 HDF5 weight files.
TensorFlow/Keras 2.10 and 2.11 can only read the legacy HDF5 format (layer_names attribute at the root). Keras 3 switched to a newer layout where weights are stored under layers/<name>/vars/<index>.
When a file saved with Keras 3 is opened by the TF 2.10/2.11 loader, the error “found 0 saved layers” is raised because the legacy reader finds no layer_names attribute.
When a file saved with Keras 2 is opened by Keras 3, an error about “expected X variables, but received 0 variables” is raised because Keras 3 cannot read the legacy format directly.
This wrapper calls model.load_weights first. If that raises a ValueError:
- If it’s a Keras 3 weights file loaded in Keras 2, it uses a custom reader that matches each model layer to its saved counterpart by weight-shape signature and file creation order.
- If it’s a Keras 2 weights file loaded in Keras 3, it delegates to Keras’ legacy HDF5 format loader.
- Parameters:
model – Keras model whose weights should be restored.
filepath (str | Path) – Path to a .h5 or .hdf5 weights file.
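The fallback order described for load_weights can be sketched as plain control flow. Everything in the block below is a stand-in: load_weights_sketch and its loader arguments are hypothetical names used to show the try-then-fall-back pattern, and none of them correspond to real Keras or segmodels_keras functions.

```python
def load_weights_sketch(native_load, legacy_h5_load, keras3_file_reader,
                        running_keras3, filepath):
    """Try the native loader first; on ValueError pick the matching fallback."""
    try:
        native_load(filepath)
        return "native"
    except ValueError:
        if running_keras3:
            # Keras 2 file opened under Keras 3: use the legacy HDF5 loader.
            legacy_h5_load(filepath)
            return "legacy-loader"
        # Keras 3 file opened under Keras 2: use the custom reader.
        keras3_file_reader(filepath)
        return "custom-reader"


def failing_loader(filepath):
    # Stand-in for a loader that cannot read the file's format.
    raise ValueError("found 0 saved layers")


def working_loader(filepath):
    # Stand-in for a loader that succeeds.
    return None


print(load_weights_sketch(working_loader, working_loader, working_loader, True, "weights.h5"))
# native
print(load_weights_sketch(failing_loader, working_loader, working_loader, False, "weights.h5"))
# custom-reader
print(load_weights_sketch(failing_loader, working_loader, working_loader, True, "weights.h5"))
# legacy-loader
```

Only ValueError triggers the fallback in this sketch, so other failures such as a missing file still surface to the caller.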