{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Modelling a river in a cross-section\n",
    "\n",
    "This notebook shows how to model a river with tidal fluctuations, based on an example from an old program called WATEX, which was used to derive the head beneath levee structures for input into geotechnical stability calculations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "import timflow\n",
    "import timflow.transient as tft\n",
    "\n",
    "timflow.show_versions()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## System description\n",
    "\n",
    "The system we want to model is shown below. It consists of 4 sections:\n",
    "\n",
    "1. River: the summer bed of the river cuts deeper into the subsurface than the other sections\n",
    "2. Foreland: the winter bed of the river due to higher water levels\n",
    "3. Hinterland: the polder area protected by a dike or levee\n",
    "4. Far hinterland: same as hinterland, but optionally with different parameters.\n",
    "\n",
    "![River Cross-Section](data/watex_cross_section.png)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Aquifer properties and parameters\n",
    "\n",
    "We will define some aquifer properties and parameters for each of the sections based on subsurface data. Note that we're modeling the hinterland and far-hinterland as one section in this example.\n",
    "\n",
    "In a subsequent step the geohydrological parameters will be calibrated. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# cross-sections\n",
    "L_river = 75.00  # m, width river, though it extends to -infinity\n",
    "L_foreland = 28.00  # m, width foreland\n",
    "L_hinterland = 4000.00  # m, width hinterland, extends to +infinity\n",
    "\n",
    "x0 = -(L_river + L_foreland)  # set x=0 at river bank\n",
    "\n",
    "# layer elevations\n",
    "ztop_riverbed = 1.00  # surface level in riverbed\n",
    "ztop_foreland = 8.0  # surface level in foreland\n",
    "ztop_hinterland = 8.0  # surface level in hinterland\n",
    "ztop_aquifer = 0.0  # top of aquifer\n",
    "zbot = -30.0  # bottom of aquifer\n",
    "\n",
    "# geohydrology\n",
    "kh = 10.0  # m/d, hydraulic conductivity\n",
    "Saq = 1e-4  # 1/m, specific storage\n",
    "\n",
    "Sll_river = 1e-6  # 1/m, specific storage aquitard\n",
    "Sll_hinterland = 1e-3  # 1/m, specific storage aquitard\n",
    "\n",
    "c_river = 100.0  # d, resistance riverbed\n",
    "c_foreland = 150.0  # d, resistance foreland\n",
    "c_hinterland = 500.0  # d, resistance hinterland"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set z-arrays for model\n",
    "z_river = [ztop_riverbed, ztop_aquifer, zbot]\n",
    "z_foreland = [ztop_foreland, ztop_aquifer, zbot]\n",
    "z_hinterland = [ztop_hinterland, ztop_aquifer, zbot]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load river levels\n",
    "\n",
    "Load data from csv file containing observed water levels relative to the mean observed water level for an 8-day period. The file contains the river stage and measurements from 2 piezometers next to the river. The water level at the polder is set to a constant value of 0.0 (i.e. no changes caused by river stage fluctuations)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = pd.read_csv(\"data/watex_example.csv\", index_col=[0])\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data.plot(figsize=(10, 2), grid=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the observed fluctuations do not fluctuate around 0 for each of these locations. In order to model this system with `timflow`, we need the variation to be around 0.0, therefore, we normalize each time series by subtracting the mean. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_norm = data - data.mean()\n",
    "data_norm.plot(figsize=(10, 2), grid=True);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build the cross-section model\n",
    "\n",
    "Now we can build the cross-section model. \n",
    "\n",
    "First, we need to define the river levels for the top boundary condition in the river and foreland sections. We want a list of time waterlevel pairs for the river water level. \n",
    "\n",
    "We're dealing with a river under the influence of tides, which means we have to be careful when specifying the river level. `timflow` maintains a given water level from a specific time until the next specified water level. \n",
    "\n",
    "The following plot shows how `timflow` inteprets the water level for different pre-processing steps on the input data. If we just pass in the river time series as is, we're continually leading the observed water level (orange line). We want to make sure the `timflow` input meets the observed level about at the midpoint (black line) of each time interval."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_norm.loc[:1, \"rivier\"].plot(figsize=(10, 3), grid=True)\n",
    "plt.step(\n",
    "    data_norm.loc[:1].index,\n",
    "    data_norm.loc[:1, \"rivier\"],\n",
    "    where=\"post\",\n",
    "    label=\"no shift\",\n",
    ")\n",
    "plt.step(\n",
    "    data_norm.loc[:1].index,\n",
    "    data_norm.loc[:1, \"rivier\"].shift(-1),\n",
    "    where=\"post\",\n",
    "    label=\"shift -1\",\n",
    ")\n",
    "mid = (data_norm.loc[:, \"rivier\"] + data_norm.loc[:, \"rivier\"].shift(-1).values).divide(2)\n",
    "plt.step(\n",
    "    mid.loc[:1].index, mid.loc[:1], where=\"post\", label=\"midpoint\", color=\"k\", lw=1.0\n",
    ")\n",
    "plt.legend(loc=(0, 1), frameon=False, ncol=4);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# tsandhstar = data_norm[\"rivier\"].to_frame().to_records().tolist()  # original\n",
    "tsandhstar = mid.dropna().to_frame().to_records().tolist()  # midpoint\n",
    "# tsandhstar = (\n",
    "#     data_norm[\"rivier\"].shift(-1).dropna().to_frame().to_records().tolist()\n",
    "# )  # shift-1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ml = tft.ModelXsection(naq=1, tmin=1e-4, tmax=1e2)\n",
    "\n",
    "river = tft.XsectionMaq(\n",
    "    ml,\n",
    "    -np.inf,  # river extends to infinitiy\n",
    "    x0 + L_river,\n",
    "    z=z_river,\n",
    "    kaq=kh,\n",
    "    Saq=Saq,\n",
    "    Sll=Sll_river,\n",
    "    c=c_river,\n",
    "    topboundary=\"semi\",\n",
    "    tsandhstar=tsandhstar,\n",
    "    name=\"river\",\n",
    ")\n",
    "foreland = tft.XsectionMaq(\n",
    "    ml,\n",
    "    x0 + L_river,\n",
    "    x0 + L_river + L_foreland,\n",
    "    kaq=kh,\n",
    "    z=z_foreland,\n",
    "    Saq=Saq,\n",
    "    Sll=Sll_river,\n",
    "    c=c_foreland,\n",
    "    topboundary=\"semi\",\n",
    "    tsandhstar=tsandhstar,\n",
    "    name=\"foreland\",\n",
    ")\n",
    "hinterland = tft.XsectionMaq(\n",
    "    ml,\n",
    "    x0 + L_river + L_foreland,\n",
    "    np.inf,  # hinterland extends to infinity\n",
    "    kaq=kh,\n",
    "    z=z_hinterland,\n",
    "    Saq=Saq,\n",
    "    Sll=Sll_hinterland,\n",
    "    c=c_hinterland,\n",
    "    topboundary=\"semi\",\n",
    "    name=\"hinterland\",\n",
    ")\n",
    "\n",
    "ml.solve()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check the sections"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "river, foreland, hinterland"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that all sections are also stored in a dictionary where the names are used that were specified with the `name` keyword (in this case they were the same as the variable in which each section was stored). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ml.aq.inhomdict"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot a cross-section of the model. Note the water level is just schematic in this picture."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ml.plots.xsection(\n",
    "    xy=[(x0 - 25, 0), (100, 0)], labels=False, params=True, names=True, hstar=10, sep=\"\\n\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Show the summary of aquifer parameters for each section:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ml.aquifer_summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot head over time at several x-locations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t = np.linspace(0.01, 8, 150)\n",
    "xlocs = [\n",
    "    -75.0,  # in river section\n",
    "    1.0,  # at observation well 1\n",
    "    47.5,  # at observation well 2\n",
    "    100.0,  # in hinterland section\n",
    "]\n",
    "\n",
    "f, ax = plt.subplots(1, 1, figsize=(10, 3))\n",
    "for i in range(len(xlocs)):\n",
    "    h = ml.head(xlocs[i], 0.0, t)\n",
    "    ax.plot(t, h[0], label=f\"x={xlocs[i]:.1f} m\")\n",
    "\n",
    "ax.plot(data_norm.index, data_norm[\"rivier\"], color=\"k\", label=\"river\", lw=1.0)\n",
    "ax.legend(loc=(0, 1), frameon=False, ncol=5, fontsize=\"small\")\n",
    "ax.set_xlabel(\"time [d]\")\n",
    "ax.set_ylabel(\"head [m]\")\n",
    "ax.set_title(\"Simulated heads\", loc=\"right\")\n",
    "ax.grid()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot head along x for multiple times"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t = np.array([0.1, 1.0, 2.0, 3.0, 4.0, 5.0])\n",
    "x = np.linspace(x0, 1000.0, 101)\n",
    "y = np.zeros_like(x)\n",
    "\n",
    "h = ml.headalongline(x, y, t)\n",
    "\n",
    "f, ax = plt.subplots(1, 1, figsize=(10, 3))\n",
    "for i in range(len(t)):\n",
    "    ax.plot(x, h[0, i], label=\"t = {:.1f}\".format(t[i]))\n",
    "ax.legend(loc=(0, 1), frameon=False, ncol=8, fontsize=\"small\")\n",
    "ax.set_xlabel(\"x [m]\")\n",
    "ax.set_ylabel(\"head [m]\")\n",
    "ax.grid()\n",
    "ax.set_title(\"Simulated heads\", loc=\"right\")\n",
    "ax.set_xlim(x0, x.max());"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compare the model results to the observed heads in the two piezometers at $x = 1.0$ m and $x = 47.5$ m."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# piezometer locations in cross-section\n",
    "xpb_kr = 1.0\n",
    "xpb_bite = 47.5\n",
    "xlocs = [xpb_kr, xpb_bite]\n",
    "\n",
    "t = data_norm.index.to_numpy()\n",
    "\n",
    "f, axes = plt.subplots(len(xlocs), 1, figsize=(10, 6), sharex=True, sharey=True)\n",
    "\n",
    "for i in range(len(xlocs)):\n",
    "    h = ml.head(xlocs[i], 0.0, t)\n",
    "    axes[i].plot(t, h[0], label=\"model\")\n",
    "    hobs = data_norm.iloc[:, i + 1]\n",
    "    axes[i].plot(hobs.index, hobs, label=\"observed\")\n",
    "    axes[i].set_title(f\"x = {xlocs[i]} m\", loc=\"right\")\n",
    "    axes[i].legend(loc=(0, 1), frameon=False, ncol=2)\n",
    "    axes[i].grid()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Calibrate model on observed heads\n",
    "\n",
    "The model simulation is reasonable at the second observation well , but it clearly isn't as good at the river bank. Let's see if we can improve the fit of the model to the observations by calibrating the aquifer parameters.\n",
    "\n",
    "Since the data at $t=0$ shows some jumps, we want to give the model some spin-up time before we start fitting to the data. We set a `tstart` value to select only observations after this moment. \n",
    "\n",
    "The following parameters will be calibrated:\n",
    "\n",
    "- $k_h$ and $S_s$ of the aquifer, this parameter should be the same in each section.\n",
    "- resistance $c$ of the confining units, this parameter can vary within each section. \n",
    "- specific storage ($S_s$) for the leaky layers of the river and foreland, and separately the specific storage of the leaky layer in the hinterland.\n",
    "\n",
    "In total this means we have 7 parameters that can vary in this calibration step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tstart = 2.0  # model spin-up time, calibrate on timeseries after 2 days\n",
    "\n",
    "# create calibrate object\n",
    "cal = tft.Calibrate(ml)\n",
    "\n",
    "# set observation time series\n",
    "cal.series(\n",
    "    \"kruin\",\n",
    "    xpb_kr,\n",
    "    0.0,\n",
    "    layer=0,\n",
    "    t=data_norm.loc[tstart:, \"kruin\"].index.to_numpy(),\n",
    "    h=data_norm.loc[tstart:, \"kruin\"].values,\n",
    ")\n",
    "cal.series(\n",
    "    \"binnenteen\",\n",
    "    xpb_bite,\n",
    "    0.0,\n",
    "    layer=0,\n",
    "    t=data_norm.loc[tstart:, \"binnenteen\"].index.to_numpy(),\n",
    "    h=data_norm.loc[tstart:, \"binnenteen\"].values,\n",
    ")\n",
    "\n",
    "# define parameters for calibration\n",
    "cal.set_parameter(\n",
    "    \"kaq\",\n",
    "    layers=0,\n",
    "    initial=river.kaq[0],\n",
    "    pmin=1.0,\n",
    "    pmax=30.0,\n",
    "    inhoms=(\"river\", \"foreland\", \"hinterland\"),\n",
    ")\n",
    "cal.set_parameter(\n",
    "    \"Saq\",\n",
    "    layers=0,\n",
    "    initial=river.Saq[0],\n",
    "    pmin=1e-6,\n",
    "    pmax=1e-2,\n",
    "    inhoms=(\"river\", \"foreland\", \"hinterland\"),\n",
    ")\n",
    "cal.set_parameter(\n",
    "    \"c\",\n",
    "    layers=0,\n",
    "    initial=river.c[0],\n",
    "    pmin=0.0,\n",
    "    pmax=100.0,\n",
    "    inhoms=\"river\",\n",
    ")\n",
    "cal.set_parameter(\n",
    "    \"c\",\n",
    "    layers=0,\n",
    "    initial=foreland.c[0],\n",
    "    pmin=10.0,\n",
    "    pmax=500.0,\n",
    "    inhoms=\"foreland\",\n",
    ")\n",
    "cal.set_parameter(\n",
    "    \"c\",\n",
    "    layers=0,\n",
    "    initial=hinterland.c[0],\n",
    "    pmin=10.0,\n",
    "    pmax=1000.0,\n",
    "    inhoms=\"hinterland\",\n",
    ")\n",
    "cal.set_parameter(\n",
    "    \"Sll\",\n",
    "    layers=0,\n",
    "    initial=river.Sll[0],\n",
    "    pmin=1e-8,\n",
    "    pmax=1e-4,\n",
    "    inhoms=(\"river\", \"foreland\"),\n",
    ")\n",
    "cal.set_parameter(\n",
    "    \"Sll\",\n",
    "    layers=0,\n",
    "    initial=hinterland.Sll[0],\n",
    "    pmin=1e-8,\n",
    "    pmax=1e-1,\n",
    "    inhoms=\"hinterland\",\n",
    ")\n",
    "\n",
    "# run the calibration\n",
    "cal.fit(report=False)\n",
    "# cal.fit_least_squares(report=False, method=\"trf\")\n",
    "\n",
    "# print the RMSE\n",
    "print(f\"RMSE: {cal.rmse():.5f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Show calibration parameters results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cal.parameters.iloc[:, :-2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot the locations of observation wells."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ax = river.plot(labels=False, params=False, names=True)\n",
    "foreland.plot(ax=ax, names=True)\n",
    "hinterland.plot(ax=ax, params=False, names=True)\n",
    "ml.elementlist[0].plot(ax=ax, hstar=10.0)\n",
    "ml.elementlist[1].plot(ax=ax, hstar=10.0)\n",
    "ax.set_xlim(x0, 100.0)\n",
    "ax.set_ylim(-30, 20)\n",
    "cal.xsection(ax=ax);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot the modeled vs. observed time series for both observation wells after calibration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# piezometer locations in cross-section\n",
    "xpb_kr = 1.0\n",
    "xpb_bite = 47.5\n",
    "xlocs = [xpb_kr, xpb_bite]\n",
    "\n",
    "t = data_norm.index.to_numpy()\n",
    "\n",
    "f, axes = plt.subplots(len(xlocs), 1, figsize=(10, 6), sharex=True, sharey=True)\n",
    "\n",
    "for i in range(len(xlocs)):\n",
    "    h = ml.head(xlocs[i], 0.0, t)\n",
    "    axes[i].plot(t, h[0], label=\"model\")\n",
    "    hobs = data_norm.loc[tstart:].iloc[:, i + 1]  # showing calibration selection\n",
    "    axes[i].plot(hobs.index, hobs, label=\"observed\")\n",
    "    axes[i].set_title(f\"x = {xlocs[i]} m\", loc=\"right\")\n",
    "    axes[i].legend(loc=(0, 1), frameon=False, ncol=2)\n",
    "    axes[i].grid()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a look at the calibrated aquifer parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ml.aquifer_summary()\n",
    "    .style.format(na_rep=\"-\", precision=2)\n",
    "    .format(\"{:.2e}\", subset=[\"S_s\"])\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Discussion\n",
    "\n",
    "The calibration result shows that the specific storage of the leaky layer in the hinterland section is orders of magnitude higher than the other specific storage terms. Also the resistance of the confining unit is higher in the foreland than it is in the polder. \n",
    "\n",
    "These results have to be analyzed further to determine whether these results make physical sense. The fit to the observed data is reasonable, but the problem is likely over-parametrized. It is not possible to estimate all parameters separately, since there are infinite combinations of parameters that fit the data equally well. This causes the problem to be highly sensitive to the starting values of parameters, as well as the selected optimization method.\n",
    "\n",
    "Some other points that could influence the results:\n",
    "\n",
    "- Only two piezometers in the cross-section. \n",
    "- The downwards trend in the head observations in the second piezometer probably isn't ideal as `timflow` will never be able to simulate that trend with the current input data.\n",
    "- The time series are supposedly presented relative to a mean observed head/water level, but somehow do not fluctuate around 0.0, which mean they had to be normalized (again) for the `timflow` simulation.\n",
    "- Neglecting the storage of the leaky layers yields a physically improbable solution where all resistance is put underneath the river section, and the foreland and hinterland have a resistance on the minimum parameter boundary. However, it is usually not expected that the storage in the leaky layers has such a large influence on the outcome. In many models this parameter is neglected altogether."
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}