1. Spatio-temporal Metrics ¶
In this notebook, we present some functionalities to perform Geographic Object Image Analysis (GEOBIA) with Earth Observation Data Cubes produced by the Brazil Data Cube (BDC) project using the STMETRICS package.
This notebook was designed to present some functionalities of the stmetrics python package. It provides the state-of-the-art features extraction methods for Satellite Image Time Series (SITS) that can be used for remote sensing time-series image classification and analysis.
Produce reliable land use and land cover maps to support the deployment and operation of public policies is a necessity, especially when environmental management and economic development are considered. To increase the accuracy of these maps, satellite image time-series have been used, as they allow the understanding of land cover dynamics through the time.
1.1. Getting started¶
Our first step is to install the ``stmetrics``. To do this you can easily run
!pip install git+https://github.com/brazil-data-cube/stmetrics
Moreover, to use the datacube produced by the BDC project we need to install the stac.py
package. You can do it using pip.
!pip install stac.py
!pip install git+https://github.com/brazil-data-cube/stmetrics.git
[1]:
#import modules
import stac
import numpy
import pandas
import rasterio
import stmetrics
import multiprocessing as mp
import matplotlib.pyplot as plt
#This is just to remove some annoying possible warnings
import warnings
warnings.filterwarnings('ignore')
1.2. Import test image¶
Our test image is one of the datacubes produced by the Brazil Data Cube Project. To access the project products we will use the stac.py
package. It can be installed using pip.
!pip install stac.py
Along with stac we must use some utility function to build a small datacube for our test. For this we use the following functions:
Hint: the stmetrics package uses numpy, rasterio and geopandas in most of its functions. Use help or check the documentation to undestand it better https://stmetrics.readthedocs.io/en/latest/.
[2]:
import xarray
def longlat2window(lon, lat, dataset):
"""
Args:
lon (tuple): Tuple of min and max lon
lat (tuple): Tuple of min and max lat
dataset: Rasterio dataset
Returns:
rasterio.windows.Window
"""
from pyproj import Proj
from rasterio.warp import transform
from rasterio.windows import Window
p = Proj(dataset.crs)
t = dataset.transform
xmin, ymin = p(lon[0], lat[0])
xmax, ymax = p(lon[1], lat[1])
col_min, row_min = ~t * (xmin, ymin)
col_max, row_max = ~t * (xmax, ymax)
return Window.from_slices(rows=(numpy.floor(row_max), numpy.ceil(row_min)),
cols=(numpy.floor(col_min), numpy.ceil(col_max)))
def file_to_da(filepath, bbox, crop=True):
import re
import numpy
import pandas
import rasterio
import xarray
from affine import Affine
from rasterio.warp import transform
#Open image as xarray.DataArray
da = xarray.open_rasterio(filepath)
#find datetime
match = re.findall(r'\d{4}-\d{2}-\d{2}', filepath)[-1]
da.coords['time'] = match
# Compute the lon/lat coordinates and build meshgrid
ny, nx = len(da['y']), len(da['x'])
x, y = numpy.meshgrid(da['x'], da['y'])
# Rasterio works with 1D arrays
lon, lat = transform(da.crs, {'init': 'EPSG:4326'}, x.flatten(), y.flatten())
lon = numpy.asarray(lon).reshape((ny, nx))
lat = numpy.asarray(lat).reshape((ny, nx))
#add spatial coordinates to xarray
da.coords['lon'] = (('y', 'x'), lon)
da.coords['lat'] = (('y', 'x'), lat)
#Crop xarray if requested
if crop == True:
#Get coordinates
w, s, e, n = bbox.split(',')
#To crop we just use where over lat/long
mask_lon = (da.lon >= float(w)) & (da.lon <= float(e))
mask_lat = (da.lat >= float(s)) & (da.lat <= float(n))
#get cropped xarray.DataArray
da = da.where(mask_lon & mask_lat, drop=True)
return da
def bdc2xray(stac, collection, bbox, time, bands=['ndvi']):
box = ",".join(str(x) for x in bbox)
collection = stac.collection(collection)
items = collection.get_items(filter={'bbox':box,'datetime':time, 'limit':1000})
list_of_datasets = []
dataset = xarray.Dataset()
#get links to images in Brazil Data Cube
for band in bands:
list_of_data_arrays = []
for item in range(len(items.features)):
dataarray = file_to_da(items.features[item].assets[band]['href'], box )
list_of_data_arrays.append(dataarray)
dataset[band] = xarray.concat(list_of_data_arrays, dim='time')
#load xarray
list_of_datasets.append(dataset)
bdc_xray = xarray.merge(list_of_datasets)
return bdc_xray
1.3. Connect to BDC-stac¶
Our first step is obtain a datacube, here we will use one of the datacubes produced by the BDC project. To do so, we will use the stac service provided by the project and the stac.py
package, which is also developed by the BDC team.
[3]:
bdc_stac = stac.STAC("http://brazildatacube.dpi.inpe.br/stac/")
bdc_stac
[3]:
STAC
- URL: http://brazildatacube.dpi.inpe.br/stac/
- Collections:
- LC8_30-1
- S2TOA-1
- HLS.S30-1
- MYD13Q1-1
- HLS.L30-1
- LC8DN-1
- LC8SR-1
- L5SR-1
- S2_10-1
- MOD13Q1-1
- S2_MSI_L2_SR_LASRC-1
- CB4_64-1
- S2_10_16D_STK-1
- LC8_30_16D_STK-1
- CB4_64_16D_STK-1
The above result display the list of collections and datacube available.
Now let’s define a bouding box of our interest area. This one is located in the MATOPIBA region in Brazil. For simplication purpose, for this test only NDVI will be used.
[4]:
w = -45.90
n = -12.20
e = -45.20
s = -12.90
bbox = ( w,s,e,n )
my_bands = ['NDVI']
timeline = '2018-09-01/2019-08-31'
Now we will use the utility functions available above to create an xarray.Dataset using stac.py
.
[5]:
xarray = bdc2xray(bdc_stac, "CB4_64_16D_STK-1", bbox, timeline, my_bands)
Let’s plot the xarray
[6]:
xarray.NDVI.plot(x="lon", y="lat", col="time", col_wrap=4, cmap='gray')
[6]:
<xarray.plot.facetgrid.FacetGrid at 0xf72bf48>
Now lets select some sample points to be used in our examples.
[7]:
img = numpy.squeeze(xarray.NDVI.values)
fig= plt.figure(figsize=(15, 15)) # width, height in inches
sub = fig.add_subplot(1, 2, 1)
sub.imshow(img[0,:,:], cmap = 'gray')
plt.scatter(1100,185,color='r')
plt.scatter(1000,800,color='r')
plt.title('Class: Natural Vegetation - Date 0')
sub = fig.add_subplot(1, 2, 2)
sub.imshow(img[0,:,:], cmap = 'gray')
plt.scatter(450,350,color='m')
plt.scatter(730,140,color='m')
plt.title('Class: Agriculture - Date 0')
plt.tight_layout()
1.4. Time Series extraction¶
Now let’s use our “samples” to extract some time series profiles.
[8]:
import matplotlib.dates as mdates
fn = img[:,1100,185] #Natural vegetation
agr = img[:,450,350] #Agriculture pattern
#get dates from xarray
dates = xarray.time.values
fig, ax = plt.subplots(1,2, figsize=(20,2))
ax[0].plot(dates,fn,color = 'r')
ax[0].title.set_text('Natural Vegetation')
ax[0].set_xticklabels(dates, rotation=45)
ax[1].plot(dates,agr, color = 'm')
ax[1].title.set_text('Agriculture')
ax[1].set_xticklabels(dates, rotation=45);
1.5. Polar plot¶
The stmetrics is currently composed by 4 modules:
- Metrics - With some functions to compute the all metrics available
- Basics - That has the implementation of the basics metrics
- Polar - That has the implementation of the polar metrics proposed by Körting (2013).
- Fractal - That has the implementatio of fractal metrics that are currently under assessment.
The polar approach proposed by Körting (2013) convert the time series to a polar coordinate system. This way we can plot the series in this space just to look at it and get some insights about the time series. For this, we can use the funtion polar_plot
.
The plot as you will see is composed by the series over four quadrants. The idea here is observe a stronger response over a specific period. Imagine that your time series encompass a year of observations, this way each quadrant represents 3 months.
Reference: Körting, Thales & Câmara, Gilberto & Fonseca, Leila. (2013).Land Cover Detection Using Temporal Features Based On Polar Representation.
[9]:
#Plotting our time series samples
#Natural Vegetation
print('Natural Vegetation')
stmetrics.polar.polar_plot(fn)
print('Agriculture')
#Pasture
stmetrics.polar.polar_plot(agr)
Natural Vegetation
Agriculture
1.6. Metrics computation¶
The stmetrics has two functions to compute the metrics. The get_metrics
function was designed to be used for compute the metrics of one time series. Along with the metrics, using the function you can also see the polar plot. Just remember, one series each time.
The sits2metrics
function was developed to compute the metrics over images, as the name states. For this function we use multiprocessing package to improve perfomance. Don’t worry if it is using the whole campacity of your system, it was designed to do it.
*NOTE:* This process may take sometime, tha package is computing 28 metrics!
[10]:
im_metrics = stmetrics.metrics.sits2metrics(xarray)
[13]:
im_metrics
[13]:
<xarray.Dataset> Dimensions: (band: 1, metric: 28, time: 24, x: 1207, y: 1263) Coordinates: * band (band) int32 1 * y (y) float64 9.965e+06 9.965e+06 ... 9.884e+06 9.884e+06 * x (x) float64 5.866e+06 5.866e+06 ... 5.943e+06 5.943e+06 * time (time) object '2019-09-13' '2019-08-28' ... '2018-09-13' lon (y, x) float64 -45.92 -45.92 -45.92 ... -45.18 -45.18 -45.18 lat (y, x) float64 -12.2 -12.2 -12.2 -12.2 ... -12.9 -12.9 -12.9 * metric (metric) object 'max_ts' 'min_ts' ... 'hurst_exp' 'katz_fd' Data variables: NDVI (time, band, y, x) float64 nan nan nan nan ... nan nan nan nan NDVI_metrics (metric, y, x) float64 nan nan nan nan nan ... nan nan nan nan
- band: 1
- metric: 28
- time: 24
- x: 1207
- y: 1263
- band(band)int321
array([1])
- y(y)float649.965e+06 9.965e+06 ... 9.884e+06
array([9964930.869096, 9964866.866757, 9964802.864417, ..., 9884287.921334, 9884223.918995, 9884159.916655])
- x(x)float645.866e+06 5.866e+06 ... 5.943e+06
array([5865778.261905, 5865842.25925 , 5865906.256595, ..., 5942831.06544 , 5942895.062785, 5942959.06013 ])
- time(time)object'2019-09-13' ... '2018-09-13'
array(['2019-09-13', '2019-08-28', '2019-08-12', '2019-07-27', '2019-07-11', '2019-06-25', '2019-06-09', '2019-05-24', '2019-05-08', '2019-04-22', '2019-04-06', '2019-03-21', '2019-03-05', '2019-02-17', '2019-02-01', '2019-01-16', '2018-12-31', '2018-12-18', '2018-12-02', '2018-11-16', '2018-10-31', '2018-10-15', '2018-09-29', '2018-09-13'], dtype=object)
- lon(y, x)float64-45.92 -45.92 ... -45.18 -45.18
array([[-45.92096364, -45.92036678, -45.91976992, ..., -45.20239765, -45.20180088, -45.20120411], [-45.9209464 , -45.92034953, -45.91975267, ..., -45.20237887, -45.2017821 , -45.20118533], [-45.92092915, -45.92033229, -45.91973543, ..., -45.2023601 , -45.20176332, -45.20116655], ..., [-45.89917918, -45.89858071, -45.89798224, ..., -45.17867809, -45.17807971, -45.17748134], [-45.89916184, -45.89856337, -45.8979649 , ..., -45.17865921, -45.17806084, -45.17746246], [-45.8991445 , -45.89854603, -45.89794756, ..., -45.17864034, -45.17804196, -45.17744358]])
- lat(y, x)float64-12.2 -12.2 -12.2 ... -12.9 -12.9
array([[-12.20091905, -12.2009026 , -12.20088614, ..., -12.18022986, -12.18021195, -12.18019403], [-12.20148863, -12.20147218, -12.20145573, ..., -12.18079941, -12.18078149, -12.18076357], [-12.20205822, -12.20204177, -12.20202531, ..., -12.18136895, -12.18135103, -12.18133312], ..., [-12.91859716, -12.91858066, -12.91856416, ..., -12.89785168, -12.89783372, -12.89781575], [-12.91916676, -12.91915026, -12.91913376, ..., -12.89842124, -12.89840327, -12.89838531], [-12.91973636, -12.91971986, -12.91970336, ..., -12.89899079, -12.89897283, -12.89895486]])
- metric(metric)object'max_ts' 'min_ts' ... 'katz_fd'
array(['max_ts', 'min_ts', 'mean_ts', 'std_ts', 'sum_ts', 'amplitude_ts', 'mse_ts', 'fslope_ts', 'skew_ts', 'amd_ts', 'abs_sum_ts', 'iqr_ts', 'fqr_ts', 'tqr_ts', 'sqr_ts', 'ecc_metric', 'gyration_radius', 'area_ts', 'polar_balance', 'angle', 'area_q1', 'area_q2', 'area_q3', 'area_q4', 'csi', 'dfa_fd', 'hurst_exp', 'katz_fd'], dtype=object)
- NDVI(time, band, y, x)float64nan nan nan nan ... nan nan nan nan
- transform :
- (63.9973451285607, 0.0, 5634843.84200847, 0.0, -64.00233949373292, 10096359.67324671)
- crs :
- +proj=aea +lat_1=-2 +lat_2=-22 +lat_0=-12 +lon_0=-54 +x_0=5000000 +y_0=10000000 +ellps=GRS80 +units=m +no_defs=True
- res :
- (63.9973451285607, 64.00233949373292)
- is_tiled :
- 1
- nodatavals :
- (-9999.0,)
- scales :
- (1.0,)
- offsets :
- (0.0,)
- AREA_OR_POINT :
- Area
array([[[[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]], [[[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]], [[[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ... [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]], [[[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]], [[[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]]])
- NDVI_metrics(metric, y, x)float64nan nan nan nan ... nan nan nan nan
array([[[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]], [[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]], [[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., ... ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]], [[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]], [[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]])
1.7. Image Metrics¶
The output of get_metrics
and sits2metrics
follows the same order and can be acessed with the list_metrics
function that is available at utils
module.
[11]:
stmetrics.utils.list_metrics()
[11]:
['max_ts',
'min_ts',
'mean_ts',
'std_ts',
'sum_ts',
'amplitude_ts',
'mse_ts',
'fslope_ts',
'skew_ts',
'amd_ts',
'abs_sum_ts',
'iqr_ts',
'fqr_ts',
'tqr_ts',
'sqr_ts',
'ecc_metric',
'gyration_radius',
'area_ts',
'polar_balance',
'angle',
'area_q1',
'area_q2',
'area_q3',
'area_q4',
'csi',
'dfa_fd',
'hurst_exp',
'katz_fd']
1.8. Now let’s plot the results!¶
Don’t worry if some metric is too noise or you don’t see a clear pattern, some features are not always representative.
[19]:
header = stmetrics.utils.list_metrics()
img = im_metrics.NDVI_metrics.values
plt.figure(1,figsize=(15,30))
for b,n in zip(range(1, img.shape[0]+1),header):
plt.subplot(14,2,b,)
plt.imshow(img[b-1,:,:])
plt.scatter(1100,185,color='r')
plt.scatter(1000,800,color='r')
plt.scatter(450,350,color='m')
plt.scatter(730,140,color='m')
plt.tight_layout()
plt.title(n)
plt.show()
1.9. That’s all!¶
Soon we will have more examples and usages to perform time series classifications.
Keep watching our repo on github https://github.com/brazil-data-cube/stmetrics and our documentation https://stmetrics.readthedocs.io/en/latest/!
If you find any problems in the package, please, submit an issue on github.
Thanks for checking!