Data Processing
Basic data processing functions are provided in the dataprocessing module. These functions are used to preprocess data before training a model, or to post-process the output of a model.
Generic Functions
plantseg.functionals.dataprocessing.dataprocessing.normalize_01(data: np.ndarray, eps=1e-12) -> np.ndarray
Normalize a numpy array to the range [0, 1] and convert it to float32.
Parameters:
- data (ndarray) – Input numpy array
- eps (float, default: 1e-12) – A small value added to the denominator for numerical stability
Returns:
- normalized_data (ndarray) – Normalized numpy array
Source code in plantseg/functionals/dataprocessing/dataprocessing.py
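The normalization described above can be sketched with plain NumPy. This is an illustrative reimplementation based only on the description (min-max scaling with `eps` in the denominator), not the library's source; the name `normalize_01_sketch` is hypothetical.

```python
import numpy as np


def normalize_01_sketch(data: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Min-max normalization to [0, 1], returned as float32 (illustrative only)."""
    data = data.astype(np.float32)
    lo, hi = data.min(), data.max()
    # eps keeps the denominator non-zero when the array is constant
    return ((data - lo) / (hi - lo + eps)).astype(np.float32)


raw = np.array([[0, 50], [100, 200]], dtype=np.uint8)
normalized = normalize_01_sketch(raw)

# A constant image would otherwise divide by zero; with eps it maps to all zeros
constant = normalize_01_sketch(np.full((2, 2), 7, dtype=np.uint8))
```

The constant-image case is the reason for `eps`: without it the denominator would be exactly zero.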
plantseg.functionals.dataprocessing.dataprocessing.scale_image_to_voxelsize(image: np.ndarray, input_voxel_size: tuple[float, float, float], output_voxel_size: tuple[float, float, float], order: int = 0) -> np.ndarray
Scale an image from a given voxel size to another voxel size.
Parameters:
- image (ndarray) – Input image to scale
- input_voxel_size (tuple[float, float, float]) – Input voxel size
- output_voxel_size (tuple[float, float, float]) – Output voxel size
- order (int, default: 0) – Interpolation order; must be 0 for segmentations and 1 or 2 for images
Returns:
- scaled_image (ndarray) – Scaled image as numpy array
Source code in plantseg/functionals/dataprocessing/dataprocessing.py
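As a rough illustration of how two voxel sizes translate into a rescaling, the sketch below assumes the per-axis factor is the input voxel size divided by the output voxel size. That convention is an assumption here, and both helper names are hypothetical.

```python
def scaling_factor_sketch(input_voxel_size, output_voxel_size):
    """Per-axis zoom factor, ASSUMING factor = input size / output size."""
    return tuple(i / o for i, o in zip(input_voxel_size, output_voxel_size))


def expected_shape_sketch(shape, factor):
    """Approximate output shape after zooming each axis by its factor."""
    return tuple(int(round(s * f)) for s, f in zip(shape, factor))


# A stack with voxel size (0.5, 0.2, 0.2) resampled to isotropic (0.25, 0.25, 0.25)
factor = scaling_factor_sketch((0.5, 0.2, 0.2), (0.25, 0.25, 0.25))
new_shape = expected_shape_sketch((10, 100, 100), factor)
```

Under this assumption, a coarser input axis (larger voxel size) is upsampled and a finer one is downsampled.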
plantseg.functionals.dataprocessing.dataprocessing.image_rescale(image: np.ndarray, factor: tuple[float, float, float], order: int) -> np.ndarray
Scale an image by a given factor in each dimension.
Parameters:
- image (ndarray) – Input image to scale
- factor (tuple[float, float, float]) – Scaling factor in each dimension
- order (int) – Interpolation order; must be 0 for segmentations and 1 or 2 for images
Returns:
- scaled_image (ndarray) – Scaled image as numpy array
Source code in plantseg/functionals/dataprocessing/dataprocessing.py
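The restriction on `order` can be illustrated in one dimension with NumPy alone. The sketch below does not call `image_rescale`; it only shows why interpolating a label array with anything other than nearest neighbour can produce label values that never existed.

```python
import numpy as np

labels = np.array([1, 1, 3, 3])               # two instances with ids 1 and 3
old_pos = np.arange(labels.size)              # original sample positions
new_pos = np.linspace(0, labels.size - 1, 7)  # upsampled positions

# Order 0 (nearest neighbour): every output value is an existing label
nearest = labels[np.rint(new_pos).astype(int)]

# Order 1 (linear): a value between 1 and 3 appears at the instance border
linear = np.interp(new_pos, old_pos, labels)
```

Here `linear` contains the value 2 at the border between the two instances, which would be read as a third, spurious instance.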
plantseg.functionals.dataprocessing.dataprocessing.image_median(image: np.ndarray, radius: int) -> np.ndarray
Apply median smoothing on an image with a given radius.
Parameters:
- image (ndarray) – Input image to apply median smoothing.
- radius (int) – Radius of the median filter.
Returns:
- ndarray – Median smoothed image.
Source code in plantseg/functionals/dataprocessing/dataprocessing.py
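To give a feel for what median smoothing with a radius does, here is a minimal 2D sketch in NumPy. It assumes a square window of side `2 * radius + 1` with reflected borders; the footprint shape and border handling used by `image_median` are not stated here, so treat both as assumptions. The name `median_sketch` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_sketch(image: np.ndarray, radius: int) -> np.ndarray:
    """2D median filter over a square (2*radius+1) window (illustrative only)."""
    size = 2 * radius + 1
    # Reflect-pad so the output keeps the input shape
    padded = np.pad(image, radius, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))


noisy = np.zeros((5, 5))
noisy[2, 2] = 100.0  # a single outlier pixel
smoothed = median_sketch(noisy, radius=1)
```

A single outlier never reaches the middle of a sorted 3x3 window, so it is removed entirely, which is the usual motivation for median over mean filtering.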
plantseg.functionals.dataprocessing.dataprocessing.image_gaussian_smoothing(image: np.ndarray, sigma: float) -> np.ndarray
Apply Gaussian smoothing on an image with a given sigma.
Parameters:
- image (ndarray) – Input image to apply Gaussian smoothing
- sigma (float) – Sigma value for Gaussian smoothing
Returns:
- smoothed_image (ndarray) – Gaussian smoothed image as numpy array
Source code in plantseg/functionals/dataprocessing/dataprocessing.py
plantseg.functionals.dataprocessing.dataprocessing.image_crop(image: np.ndarray, crop_str: str) -> np.ndarray
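Crop an image using a crop string such as [:, 10:30:, 10:20].
Parameters:
- image (ndarray) – Input image to crop
- crop_str (str) – Crop string in slice notation, one field per axis
Returns:
- cropped_image (ndarray) – Cropped image as numpy array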
Source code in plantseg/functionals/dataprocessing/dataprocessing.py
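One way a crop string could be turned into NumPy slices is sketched below. The parser `parse_crop_sketch` is hypothetical and only handles `start:stop[:step]` fields; how `image_crop` actually parses and validates its input is not shown here.

```python
import numpy as np


def parse_crop_sketch(crop_str: str) -> tuple:
    """Turn a string such as '[:, 10:30, 10:20]' into a tuple of slices."""
    slices = []
    for axis in crop_str.strip().strip("[]").split(","):
        # Empty fields (e.g. around a bare ':') become None, i.e. "no bound"
        parts = [int(p) if p.strip() else None for p in axis.split(":")]
        if len(parts) == 1:
            raise ValueError("this sketch only supports 'start:stop[:step]' fields")
        slices.append(slice(*parts))
    return tuple(slices)


image = np.arange(4 * 40 * 40).reshape(4, 40, 40)
cropped = image[parse_crop_sketch("[:, 10:30:, 10:20]")]
```

Parsing into `slice` objects rather than evaluating the string avoids executing arbitrary user input.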
plantseg.functionals.dataprocessing.dataprocessing.process_images(image1: np.ndarray, image2: np.ndarray, operation: ImagePairOperation, normalize_input: bool = False, clip_output: bool = False, normalize_output: bool = True) -> np.ndarray
General function for performing image operations with optional preprocessing and post-processing.
Parameters:
- image1 (ndarray) – First input image.
- image2 (ndarray) – Second input image.
- operation (str) – Operation to perform ('add', 'multiply', 'subtract', 'divide', 'max').
- normalize_input (bool, default: False) – Whether to normalize the input images to the range [0, 1]. Default is False.
- clip_output (bool, default: False) – Whether to clip the resulting image values to the range [0, 1]. Default is False.
- normalize_output (bool, default: True) – Whether to normalize the output image to the range [0, 1]. Default is True.
Returns:
- ndarray – The resulting image after performing the operation.
Source code in plantseg/functionals/dataprocessing/dataprocessing.py
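The flow described by the parameters above (optional input normalization, the pairwise operation, then optional clipping and output normalization) might look roughly like this in NumPy. The ordering of the clip and normalize steps, the divide-by-zero guard, and all names ending in `_sketch` are assumptions of this example rather than documented behaviour.

```python
import numpy as np


def _norm01(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    x = x.astype(np.float32)
    return (x - x.min()) / (x.max() - x.min() + eps)


# Operation names are taken from the parameter description
OPERATIONS = {
    "add": np.add,
    "multiply": np.multiply,
    "subtract": np.subtract,
    "divide": lambda a, b: a / (b + 1e-12),  # guard is an assumption of this sketch
    "max": np.maximum,
}


def process_images_sketch(image1, image2, operation, normalize_input=False,
                          clip_output=False, normalize_output=True):
    if normalize_input:
        image1, image2 = _norm01(image1), _norm01(image2)
    result = OPERATIONS[operation](image1.astype(np.float32),
                                   image2.astype(np.float32))
    if clip_output:
        result = np.clip(result, 0.0, 1.0)
    if normalize_output:
        result = _norm01(result)
    return result


a = np.array([[0.0, 0.5], [1.0, 1.0]])
b = np.array([[1.0, 0.5], [0.0, 1.0]])
summed = process_images_sketch(a, b, "add", normalize_output=False)
clipped = process_images_sketch(a, b, "add", clip_output=True, normalize_output=False)
```

Adding two images in [0, 1] can reach 2, which is why clipping or output normalization is offered.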
Segmentation Functions
plantseg.functionals.dataprocessing.labelprocessing.relabel_segmentation(segmentation_image: np.ndarray, background: int | None = None) -> np.ndarray
Relabels a segmentation image contiguously; non-touching instances that share an id receive different labels. Note that measure.label behaves differently from ndimage.label.
1-connectivity     2-connectivity     diagonal connection close-up

     [ ]           [ ]  [ ]  [ ]             [ ]
      |               \  |  /                 |  <- hop 2
[ ]--[x]--[ ]      [ ]--[x]--[ ]        [x]--[ ]
      |               /  |  \             hop 1
     [ ]           [ ]  [ ]  [ ]
Parameters:
- segmentation_image (ndarray) – A 2D or 3D segmentation image where connected components represent different instances.
- background (int | None, default: None) – Label of the background. If None, the function will assume the background label is 0. Default is None.
Returns:
- ndarray – A relabeled segmentation image where each connected component is assigned a unique integer label.
Source code in plantseg/functionals/dataprocessing/labelprocessing.py
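The relabeling behaviour (separate components that share an id get separate labels) can be sketched with a small flood fill. This example assumes 1-connectivity and a background of 0; the connectivity that `relabel_segmentation` actually uses is not stated here, and `relabel_sketch` is a hypothetical name.

```python
from collections import deque

import numpy as np


def relabel_sketch(seg: np.ndarray, background: int = 0) -> np.ndarray:
    """Give every 1-connected component of equal id its own consecutive label."""
    out = np.zeros(seg.shape, dtype=np.int64)
    next_label = 1
    for start in np.ndindex(seg.shape):
        if seg[start] == background or out[start] != 0:
            continue
        # Flood-fill the component that contains `start`
        out[start] = next_label
        queue = deque([start])
        while queue:
            pos = queue.popleft()
            for axis in range(seg.ndim):
                for step in (-1, 1):
                    nb = list(pos)
                    nb[axis] += step
                    nb = tuple(nb)
                    inside = 0 <= nb[axis] < seg.shape[axis]
                    if inside and out[nb] == 0 and seg[nb] == seg[start]:
                        out[nb] = next_label
                        queue.append(nb)
        next_label += 1
    return out


seg = np.array([
    [5, 5, 0, 5],
    [0, 0, 0, 5],
    [9, 9, 0, 0],
])
relabeled = relabel_sketch(seg)
```

The two disconnected regions with id 5 end up with different labels, and the output ids run consecutively from 1.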
plantseg.functionals.dataprocessing.labelprocessing.set_background_to_value(segmentation_image: np.ndarray, value: int = 0) -> np.ndarray
Sets all occurrences of the background (label 0) in the segmentation image to a new value.
Parameters:
- segmentation_image (ndarray) – A 2D or 3D numpy array representing an instance segmentation.
- value (int, default: 0) – The value to assign to the background. Default is 0.
Returns:
- ndarray – A segmentation image where all background pixels (originally 0) are set to value.
Source code in plantseg/functionals/dataprocessing/labelprocessing.py
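The effect described above amounts to a conditional replacement, which can be written in NumPy as follows. This is a sketch of the described behaviour, not the function's source; the value 255 is an arbitrary choice for the example.

```python
import numpy as np

seg = np.array([[0, 1, 1],
                [0, 2, 2]])

# Replace label 0 with a new value, leaving all other labels untouched
new_background = 255
moved = np.where(seg == 0, new_background, seg)
```

With this approach, choosing a value that is already used by an instance would merge the background with that instance, so an unused value is the safer choice.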
Advanced Functions
plantseg.functionals.dataprocessing.advanced_dataprocessing.fix_over_under_segmentation_from_nuclei(cell_seg: np.ndarray, nuclei_seg: np.ndarray, threshold_merge: float = 0.33, threshold_split: float = 0.66, quantiles_nuclei: tuple[float, float] = (0.3, 0.99), boundary: np.ndarray | None = None) -> np.ndarray
Corrects over-segmentation and under-segmentation of cells based on a trusted nuclei segmentation.
This function uses information from nuclei segmentation to refine cell segmentation by first identifying over-segmented cells (cells mistakenly split into multiple segments) and merging them. It then corrects under-segmented cells (multiple nuclei within a single cell) by splitting them based on nuclei position and optional boundary information.
Parameters:
- cell_seg (ndarray) – A 2D or 3D array representing segmented cell instances.
- nuclei_seg (ndarray) – A 2D or 3D array representing segmented nuclei instances, with the same shape as cell_seg.
- threshold_merge (float, default: 0.33) – Threshold for identifying over-segmentation, based on the ratio of nuclei overlap. Cells with overlap below this threshold will be merged. Default is 0.33.
- threshold_split (float, default: 0.66) – Threshold for identifying under-segmentation, based on the ratio of nuclei overlap. Cells with overlap above this threshold will be split. Default is 0.66.
- quantiles_nuclei (tuple[float, float], default: (0.3, 0.99)) – Quantile range for filtering nuclei based on size, helping to ignore outliers such as very small or very large nuclei. Default is (0.3, 0.99).
- boundary (ndarray | None, default: None) – An optional boundary probability map for the cells. If None, a constant map is used to treat all regions equally. This can help refine under-segmentation correction.
Returns:
- ndarray – The corrected cell segmentation array, of the same shape as the input cell_seg.
Source code in plantseg/functionals/dataprocessing/advanced_dataprocessing.py
plantseg.functionals.dataprocessing.advanced_dataprocessing.remove_false_positives_by_foreground_probability(segmentation: np.ndarray, foreground: np.ndarray, threshold: float) -> np.ndarray
Removes false positive regions in a segmentation based on a foreground probability map.
- Labels are not preserved.
- An instance is removed if the mean foreground probability over its own region is below the threshold.
Parameters:
- segmentation (ndarray) – Segmentation array where each unique non-zero value indicates a distinct region.
- foreground (ndarray) – Foreground probability map of the same shape as segmentation.
- threshold (float) – Probability threshold below which regions are considered false positives.
Returns:
- ndarray – Segmentation array with false positives removed.
Source code in plantseg/functionals/dataprocessing/advanced_dataprocessing.py
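The filtering rule (drop an instance when its mean foreground probability falls below the threshold) can be sketched with NumPy bin counts. The example assumes label 0 is background and renumbers the surviving instances consecutively, in line with the note that labels are not preserved; `remove_false_positives_sketch` is a hypothetical name and not the library's implementation.

```python
import numpy as np


def remove_false_positives_sketch(segmentation, foreground, threshold):
    """Drop instances whose mean foreground probability is below threshold."""
    labels, inverse = np.unique(segmentation, return_inverse=True)
    inverse = inverse.reshape(-1)
    # Mean probability per label via weighted bin counts
    sums = np.bincount(inverse, weights=foreground.reshape(-1))
    counts = np.bincount(inverse)
    keep = (sums / counts >= threshold) & (labels != 0)
    # Kept labels are renumbered 1..N, everything else becomes background (0)
    new_ids = np.where(keep, np.cumsum(keep), 0)
    return new_ids[inverse].reshape(segmentation.shape)


segmentation = np.array([[1, 1, 2, 2],
                         [1, 1, 3, 3]])
foreground = np.array([[0.9, 0.8, 0.1, 0.2],
                       [0.9, 0.9, 0.7, 0.9]])
cleaned = remove_false_positives_sketch(segmentation, foreground, threshold=0.5)
```

In this toy input, instance 2 has a mean probability of 0.15 and is removed, while instance 3 survives but is renumbered to 2.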