Front-end
Building the front-end should be just a question of combining different pieces together. To simplify the design process, we are going to use an off-the-shelf framework, material-ui. It comes with a set of ready-to-use components that simplify customization, and for the main skeleton, react material dashboard is exactly what we need to start experimenting quickly.
Skeleton
Out of all the available data, we have to choose the parts that are most significant. For our experiment we are going to go with the following screens:
- Dashboard - main landing screen that presents the NSW map itself, statistics per suburb, and summarized state-wide statistics.
- Cumulative totals - number of cases per day accumulated daily
- Distribution - distribution of cases within the dataset
- Correlation - pairwise correlation of different metrics from the dataset
- Regression - linear regression of different metrics from the dataset
Dashboard
The main landing page will show the map distribution of various statistics, including total count of cases, active cases, recovered cases and tests, all per suburb per day. As we discussed before, we can use calculated data ranges to color-code the values and paint suburbs in different colors. According to Mapbox, information on the map can be grouped in layers, where a layer is a styled representation of data of a single type. We can also show or hide layers according to the currently selected data slice (total cases vs. active cases, etc.).
Here is an example of the total cases layer:
// data ranges with keys
const caseLevels = [{
start: 0,
end: 9,
key: 9
}, {
start: 10,
end: 19,
key: 19
}, {
start: 20,
end: 29,
key: 29
}, {
start: 30,
end: 39,
key: 39
}, {
start: 40,
end: 49,
key: 49
}, {
start: 50,
key: 500
}];
// color-code schema
const getCaseColorSchema = () => ([
0, 'transparent',
caseLevels[0].key, '#ffb3b3',
caseLevels[1].key, '#ff8080',
caseLevels[2].key, '#ff4d4d',
caseLevels[3].key, '#e60000',
caseLevels[4].key, '#b30000',
caseLevels[5].key, '#660000'
]);
// layer ids
const layers = {
casesId: 'PostcodeCases',
testsId: 'PostcodeTests',
activeId: 'PostcodeActive',
recoveredId: 'PostcodeRecovered'
};
// mapbox layer for filling in map areas with specific colors
<Layer
id={layers.casesId}
source="Postcode"
type="fill"
paint={{
'fill-color': [
'interpolate',
['linear'],
['get', selectedDate],
...getCaseColorSchema()
],
'fill-opacity': 0.8
}}
/>
caseLevels is a data structure that connects two parts together: a particular layer and the dataset it represents from the features metadata.
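To illustrate how a raw count could be matched to one of these ranges, here is a minimal sketch. It assumes the feature metadata stores the range key for each date rather than the raw count; `findLevel` and `toLevelKey` are hypothetical helper names, not part of Mapbox or of the project code.

```javascript
// Same shape as the caseLevels array above: the last range is open-ended (no `end`).
const caseLevels = [
  { start: 0, end: 9, key: 9 },
  { start: 10, end: 19, key: 19 },
  { start: 20, end: 29, key: 29 },
  { start: 30, end: 39, key: 39 },
  { start: 40, end: 49, key: 49 },
  { start: 50, key: 500 }
];

// Hypothetical helper: returns the range a count falls into, or undefined.
// A missing `end` is treated as "no upper bound".
const findLevel = (count, levels) =>
  levels.find(({ start, end = Infinity }) => count >= start && count <= end);

// Hypothetical helper: the key to store in the feature metadata.
// Falls back to 0 (the transparent stop) when no range matches.
const toLevelKey = (count, levels) => {
  const level = findLevel(count, levels);
  return level ? level.key : 0;
};
```

With these ranges, `toLevelKey(15, caseLevels)` yields 19 and `toLevelKey(75, caseLevels)` yields 500, which are the same keys the color schema uses as stops.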
To make the information more presentable, a popup will be shown when the user clicks any suburb area, with the following information:
- total - number of cases for selected date
- active - number of active cases for selected date
- recovered - number of recovered cases for selected date
- tests - number of tests for selected date
In addition to the map, the dashboard will contain a summary of all datasets represented by the map for the entire history of observations:
- total cases - all suburbs
- total active - all suburbs
- total recovered - all suburbs
- total tests - all suburbs
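The summary above could be computed with a single reduce over the per-suburb entries. This is only a sketch: it assumes each entry carries `Cases`, `Active`, `Recovered` and `Tests` fields (the names used in the correlation code later in this article) and that those values are already running totals for the chosen date. `summarize` is a hypothetical helper name.

```javascript
// Hypothetical helper: adds up the four metrics across all suburbs.
// Missing or undefined fields are counted as 0.
const summarize = (entries) =>
  entries.reduce(
    (acc, entry) => ({
      cases: acc.cases + (entry.Cases || 0),
      active: acc.active + (entry.Active || 0),
      recovered: acc.recovered + (entry.Recovered || 0),
      tests: acc.tests + (entry.Tests || 0)
    }),
    { cases: 0, active: 0, recovered: 0, tests: 0 }
  );

// Made-up sample data for two suburbs.
const sample = [
  { Cases: 10, Active: 2, Recovered: 8, Tests: 150 },
  { Cases: 4, Active: 1, Recovered: 3, Tests: 90 }
];
const summary = summarize(sample);
```

If the entries live in a Map keyed by postcode, something like `summarize([...entry.values()])` would feed them in.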
Cumulative totals
If the first page shows current, up-to-date information, the second page in the menu describes the situation from a historical point of view. It presents two charts: daily cumulative totals and daily cumulative totals by suburb.
Since this is the first time we are mentioning charts in this tutorial, it is worth describing the choice of the library that handles the rendering. The choice is driven by the potentially large amount of data presented on any of the pages. When it comes to charts, there are two main options to handle this situation: SVG-based charts and canvas-based charts. For our situation canvas works better because of the performance considerations related to the amount of data (When to Use SVG vs. When to Use Canvas). We are going to go with react-chartjs-2 because of its simplicity of use, but any of the canvas-based libraries will do (Comparison of JavaScript charting libraries).
Daily cumulative totals
Daily cumulative totals is a summary of the number of cases in two dimensions split by day: new cases and total number of cases. The horizontal axis presents dates, while the vertical axis combines two metrics: new cases and total cases from the beginning of the observations up to date. This type of chart can be created by combining a Bar chart and a Line chart with two datasets, one for daily cases (Bar chart) and one for totals (Line chart):
const chartData = {
datasets: [{
backgroundColor: colors.red[500],
label: 'Daily new cases',
data: data.daily,
order: 1
},
{
backgroundColor: colors.indigo[100],
label: 'Cumulative totals',
data: data.cumulative,
type: 'line',
order: 2
}],
labels: dates
};
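The cumulative series can be derived from the daily one with a running sum. The sketch below shows one way to do it; `toCumulative` is a hypothetical helper name, and the sample numbers are made up.

```javascript
// Hypothetical helper: turns daily new-case counts into running totals.
const toCumulative = (daily) => {
  let running = 0;
  return daily.map((value) => {
    running += value;
    return running;
  });
};

const daily = [2, 0, 5, 3];
const cumulative = toCumulative(daily); // [2, 2, 7, 10]
```

Both arrays keep the same length and order as the date labels, so they can be passed to the two datasets as they are.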
Daily cumulative totals by suburb
Daily cumulative totals by suburb shows exactly the same type of chart as the daily cumulative totals chart, but in the context of the specified suburb. The only specific front-end feature that makes life easier is the ability to select multiple suburbs at the same time and present multiple charts for them. For that, the Autocomplete component from the material-ui library works best:
const SuburbSelect = ({ onChange }) => {
const classes = useStyles();
const { suburbs } = useContext(DataContext);
return (
<Autocomplete
multiple
disableCloseOnSelect
className={classes.root}
options={suburbs}
getOptionLabel={(option) => `${option.postCode} ${option.name}`}
renderInput={(params) => <TextField {...params} label="Post code / Suburb" variant="outlined" />}
renderOption={(option) => renderOption(option, classes)}
onChange={(_, value) => onChange(value)}
/>
);
};
Distribution
Following the summary statistics idea, the third page will display the distribution per suburb for the same datasets presented on the first page (total cases, active, recovered, tests). To represent the distribution graphically, we need to calculate a histogram of cases per suburb and render it using a Bar chart.
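A histogram for the Bar chart could be calculated roughly as follows. This is a sketch under assumptions: it uses fixed-width bins, and both the `histogram` helper and the `binSize` parameter are hypothetical, since the bin width used by the real page is not stated here.

```javascript
// Hypothetical helper: groups values into fixed-width bins and returns
// labels and counts in the shape a Bar chart expects.
// Bins that contain no values are skipped.
const histogram = (values, binSize) => {
  const counts = new Map();
  values.forEach((value) => {
    const binStart = Math.floor(value / binSize) * binSize;
    counts.set(binStart, (counts.get(binStart) || 0) + 1);
  });
  const starts = [...counts.keys()].sort((a, b) => a - b);
  return {
    labels: starts.map((start) => `${start}-${start + binSize - 1}`),
    data: starts.map((start) => counts.get(start))
  };
};

// Made-up case counts for six suburbs, grouped in bins of 10.
const result = histogram([1, 4, 12, 15, 19, 33], 10);
// result.labels: ['0-9', '10-19', '30-39'], result.data: [2, 3, 1]
```

The labels describe integer ranges, which suits counts of cases or tests; for continuous values a different label format would be needed.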
Correlation
So far we've been dealing with different parts of the main dataset by themselves. For the correlation, we are comparing parts against each other to see if there are any dependencies between them (changes in one part/metric correlate with changes in the other one - Correlation).
Correlation can be represented both numerically (correlation coefficient) and graphically. For the graph/chart part we are rendering the dependent variable relative to the independent one in a Scatter Plot chart. Both independent and dependent variables are taken for the date of the measurement. To give more visibility to the user, we are displaying all the correlation pairs in a rectangular grid where columns and rows represent the corresponding variables. Their intersections represent the correlation result (graphical or numeric), and the diagonal is empty since the correlation of a variable with itself always yields a coefficient of 1, the strongest possible correlation (Multiple correlation). For the Scatter Plot, the trick is to prepare the data in such a way that the dependent variable is aligned with the independent one. In our context that means forming an array of the two variables by a pair-wise merge (zip) and sorting the array by the independent variable.
const sortArray = (data, compare) => {
data.sort(compare);
return data;
};
const data = useMemo(() => {
if (!cases.has(date)) {
return [];
}
const entry = cases.get(date);
const mainData = population.reduce((acc, curr) => {
if (!entry.has(curr.POA_NAME16.toString())) {
return acc;
}
const caseEntry = entry.get(curr.POA_NAME16.toString());
acc.push({
population: curr.Tot_p_p,
cases: caseEntry.Cases,
active: caseEntry.Active,
recovered: caseEntry.Recovered,
tests: caseEntry.Tests
});
return acc;
}, []);
const result = {
population: {
cases: sortArray(
mainData.map((item) => ({ x: item.population, y: item.cases })),
(a, b) => a.x - b.x
),
active: sortArray(
mainData.map((item) => ({ x: item.population, y: item.active })),
(a, b) => a.x - b.x
),
recovered: sortArray(
mainData.map((item) => ({ x: item.population, y: item.recovered })),
(a, b) => a.x - b.x
),
tests: sortArray(
mainData.map((item) => ({ x: item.population, y: item.tests })),
(a, b) => a.x - b.x
),
},
cases: {
population: sortArray(
mainData.map((item) => ({ x: item.cases, y: item.population })),
(a, b) => a.x - b.x
),
active: sortArray(
mainData.map((item) => ({ x: item.cases, y: item.active })),
(a, b) => a.x - b.x
),
recovered: sortArray(
mainData.map((item) => ({ x: item.cases, y: item.recovered })),
(a, b) => a.x - b.x
),
tests: sortArray(
mainData.map((item) => ({ x: item.cases, y: item.tests })),
(a, b) => a.x - b.x
),
},
active: {
population: sortArray(
mainData.map((item) => ({ x: item.active, y: item.population })),
(a, b) => a.x - b.x
),
cases: sortArray(
mainData.map((item) => ({ x: item.active, y: item.cases })),
(a, b) => a.x - b.x
),
recovered: sortArray(
mainData.map((item) => ({ x: item.active, y: item.recovered })),
(a, b) => a.x - b.x
),
tests: sortArray(
mainData.map((item) => ({ x: item.active, y: item.tests })),
(a, b) => a.x - b.x
),
},
recovered: {
population: sortArray(
mainData.map((item) => ({ x: item.recovered, y: item.population })),
(a, b) => a.x - b.x
),
cases: sortArray(
mainData.map((item) => ({ x: item.recovered, y: item.cases })),
(a, b) => a.x - b.x
),
active: sortArray(
mainData.map((item) => ({ x: item.recovered, y: item.active })),
(a, b) => a.x - b.x
),
tests: sortArray(
mainData.map((item) => ({ x: item.recovered, y: item.tests })),
(a, b) => a.x - b.x
),
},
tests: {
population: sortArray(
mainData.map((item) => ({ x: item.tests, y: item.population })),
(a, b) => a.x - b.x
),
cases: sortArray(
mainData.map((item) => ({ x: item.tests, y: item.cases })),
(a, b) => a.x - b.x
),
active: sortArray(
mainData.map((item) => ({ x: item.tests, y: item.active })),
(a, b) => a.x - b.x
),
recovered: sortArray(
mainData.map((item) => ({ x: item.tests, y: item.recovered })),
(a, b) => a.x - b.x
)
}
};
return result;
}, [population, cases, date]);
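For the numeric part of the grid, the correlation coefficient can be computed from the same `{ x, y }` pairs. The sketch below uses the Pearson coefficient, which is an assumption about which coefficient is meant; `pearson` is a hypothetical helper name.

```javascript
// Hypothetical helper: Pearson correlation coefficient for an array of
// { x, y } points. Returns NaN when there are fewer than two points or
// when either variable has no variance.
const pearson = (points) => {
  const n = points.length;
  if (n < 2) {
    return NaN;
  }
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  points.forEach(({ x, y }) => {
    const dx = x - meanX;
    const dy = y - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  });
  const denominator = Math.sqrt(varianceX * varianceY);
  return denominator === 0 ? NaN : covariance / denominator;
};

// Made-up points that lie exactly on a rising line.
const r = pearson([{ x: 1, y: 2 }, { x: 2, y: 4 }, { x: 3, y: 6 }]); // 1
```

The order of the points does not affect the result, so the sorted arrays prepared for the Scatter Plot can be reused directly.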
Regression
The correlation coefficient indicates how far the actual dataset is from the expected values that lie on a line of best fit through the two variables, so it makes sense to have another page that shows just that line (Linear regression). For the chart, we can combine a Line chart for the regression line with a Scatter Plot chart that repeats the correlation results.
const chartData = {
datasets: [{
data,
label: name,
tooltipLabel: yName,
backgroundColor,
order: 2
}, {
data: lineData,
label: 'Regression line',
tooltipLabel: `Regression ${yName}`,
type: 'line',
fill: false,
order: 1
}]
};
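The points for the regression line could be produced with an ordinary least-squares fit. This is a minimal sketch, assuming the same `{ x, y }` point shape as the scatter data; `linearRegression` and `toLineData` are hypothetical helper names, and the real page may compute `lineData` differently.

```javascript
// Hypothetical helper: least-squares fit of y = slope * x + intercept.
// Expects at least one point; a vertical or single-point dataset yields slope 0.
const linearRegression = (points) => {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  points.forEach(({ x, y }) => {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) * (x - meanX);
  });
  const slope = sxx === 0 ? 0 : sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
};

// Hypothetical helper: two end points are enough to draw a straight line.
const toLineData = (points) => {
  if (points.length === 0) {
    return [];
  }
  const { slope, intercept } = linearRegression(points);
  const xs = points.map((p) => p.x);
  return [Math.min(...xs), Math.max(...xs)].map((x) => ({
    x,
    y: slope * x + intercept
  }));
};

// Made-up points on the line y = 2x + 1.
const lineData = toLineData([{ x: 0, y: 1 }, { x: 1, y: 3 }, { x: 2, y: 5 }]);
// lineData: [{ x: 0, y: 1 }, { x: 2, y: 5 }]
```

Returning only the two end points keeps the line dataset small regardless of how many suburbs the scatter dataset contains.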