Patrol Data Validation
Some of the questions I get about problems in SMART result from errors in patrol data: erroneous waypoints or invalid tracks. We talked about data validation a long time ago; we should consider it again. We could do a variety of things here, but a very simple check would be to ensure that all points fall within X km of the conservation area boundary.
Ideas for implementation:
-Somewhere (perhaps under conservation area properties), set either an AOI outside of which points would be flagged/deleted/ignored, or a maximum distance from the CA boundary.
-Set a maximum speed from track point to track point, above which points would be flagged/deleted/ignored.
-Implement something akin to a “data cleaning query” interface, where a user can run a query and then select points for deletion/editing.
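The AOI/boundary idea above could be sketched as a simple point-in-polygon test. This is an illustrative, self-contained example: the boundary polygon and waypoints are made-up, and a real implementation would read the CA boundary layer from the SMART database rather than hard-coding coordinates.

```python
# Hypothetical sketch of the "points outside an AOI" check.
# Polygon and waypoints here are illustrative only.

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) falls inside `polygon`,
    given as a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def flag_outside(points, boundary):
    """Return the waypoints that fall outside the boundary polygon."""
    return [p for p in points if not point_in_polygon(p[0], p[1], boundary)]

# Example: a square CA boundary and three waypoints, one clearly bad.
ca_boundary = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
waypoints = [(0.5, 0.5), (0.2, 0.8), (5.0, 5.0)]
print(flag_outside(waypoints, ca_boundary))  # → [(5.0, 5.0)]
```

Supporting a "maximum distance from the CA boundary" instead would mean testing against a buffered polygon rather than the boundary itself.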
I think we'd also want to flag patrols with 0 km distance (i.e. patrols where no tracklog was present).
I would see this as a 'patrol data validation' feature in the patrol window interface (perhaps as an additional tab), as it would primarily be used to validate data just after patrol import from CT or manual patrol data entry.
From Emma's email of Oct 2, 2016, additional ideas for automated data verification:
"i) Checks for patrols with no coordinates and no data (typically errors when using the CT plug-in).
ii) Check for duplications (..similar to) when you add a new employee"
Re: ii) we do check for patrol duplicates now, i.e. if you load the same patrol again, it should warn you that it looks like a duplicate. Maybe this refers to duplicate observations. That could be tricky: you might see many more false duplicates than actual duplicates and just bog down the data-loading process.
RRI Discussion:
This is open-ended and depends on what we implement.
The steering committee identified the speed check as the best indicator for finding bad points: for ground and water patrols, anything that moves faster than 150 km/h is a bad point and should be reported to the importing user as probable bad data. For air patrols, the threshold increases to 1000 km/h.
Ideas:
-Possibly make this all configurable: a maximum speed to check per transport type?
-Check for points outside the conservation area boundary?
-Might need a UI to manually run this validation, and/or a saved configuration for running it automatically[RAB1].
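The speed check discussed above can be sketched quite directly: compute the great-circle distance between consecutive track points, divide by elapsed time, and compare against a per-transport-type threshold. The thresholds below are the steering committee's numbers; the data layout and function names are illustrative assumptions.

```python
import math

# Thresholds from the discussion above; everything else is illustrative.
MAX_SPEED_KMH = {"ground": 150.0, "water": 150.0, "air": 1000.0}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in km."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def flag_fast_segments(track, transport):
    """track: time-ordered list of (lat, lon, epoch_seconds) tuples.
    Returns indices of segments whose implied speed exceeds the threshold."""
    limit = MAX_SPEED_KMH[transport]
    bad = []
    for i in range(len(track) - 1):
        lat1, lon1, t1 = track[i]
        lat2, lon2, t2 = track[i + 1]
        hours = (t2 - t1) / 3600.0
        if hours <= 0:
            bad.append(i)  # non-increasing timestamps are suspect too
            continue
        if haversine_km(lat1, lon1, lat2, lon2) / hours > limit:
            bad.append(i)
    return bad

# Example: the first segment jumps ~111 km in one minute -> flagged.
track = [(0.0, 0.0, 0), (1.0, 0.0, 60), (1.001, 0.0, 120)]
print(flag_fast_segments(track, "ground"))  # → [0]
```

Making `MAX_SPEED_KMH` user-editable per transport type would cover the configurability point above.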
RRI Update: This needs a lot of thought. We would like to know the precise list of checks required, and what should happen when a bad point is found. [In February 2015 Refractions proposed a comprehensive solution – please see the Appendix “Data Validation” of this document. We would want to revisit this and design it properly.]
[RAB1]Yes, I think this needs to be configurable. Speeds will be different for foot, vehicle, air, etc. Also, it would be good to be able to manually assign the layer that serves as the boundary for valid points, rather than having it just be the CA boundary.
From an email dated 3 February 2015:
“we propose making a plug-in that provides a reusable QA framework that includes roughly the following:
-A GUI that provides a list of all available "QA processes" that have been added
-Some tools to select and run these algorithms on a selected set of data (patrols, incidents, missions), using filters for things like transport type and dates to let the user select a subset of the data
-A framework to show the results of these algorithms as a table
-All the plug-in packaging and code necessary to allow additional algorithms to be easily added
-A QA algorithm that allows users to select any number of the built-in shapefile layers (CA boundary, buffer, administrative areas, etc.) and show all points that fall outside of all the selected layers (the default would be the CA boundary + the CA buffer, I imagine). This process would list all the points in the table, and the user could select any number of them and delete them from the SMART database.
-Another QA process users can run that checks the speed of patrols based on time and distance between each point, then highlights any point pairings where the speed was over the user-entered maximum threshold[RAB1].
-The ability to double-click a returned point and have it open the correct patrol and leg, to allow users to inspect the point in more detail and quickly determine whether it is valid[RAB2].
We've estimated the above tasks at about 3 weeks of effort. This would then allow us to quite easily add other algorithms that use the above framework, without any new UI components, which are often time-consuming.
Mapping Option:
We could also include a mapping window in the above framework to allow users to quickly and easily select and highlight points they wish to inspect, and see them on a map in relation to the CA's 5 layers and all the other points found during the QA process. This would add about 1 week of effort. We think there is a reasonable chance this would be a useful tool for any QA process where users need to make a judgment call on validity and want as much information as possible.
Additional QA algorithm:
The other QA process we've discussed in the past is cleaning patrol track data by removing numerous points that are all in the same area, mostly due to a GPS device that was left on while the patrol had stopped to rest or sleep. This algorithm could be added to the above framework, but we are not yet sure of the exact method for detecting and fixing this issue. We estimate about 5 days to detail and implement an approach and add it as an available tool in the QA framework plug-in.”[RAB3]
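The plug-in framework the email sketches — a registry of named "QA processes", each run over selected data and returning rows for a results table — could look roughly like the following. All names and the patrol data layout here are hypothetical; the real SMART plug-in would be built on its Eclipse-based architecture, not on a dictionary.

```python
# Illustrative sketch of a pluggable QA framework: processes register
# themselves under a display name and return result rows for a table.
QA_PROCESSES = {}

def qa_process(name):
    """Decorator that registers a QA check under a display name."""
    def register(fn):
        QA_PROCESSES[name] = fn
        return fn
    return register

@qa_process("Zero-distance patrols")
def zero_distance(patrols):
    # Each returned dict would become one row in the results table.
    return [{"patrol": p["id"], "issue": "no tracklog / 0 km"}
            for p in patrols if p.get("distance_km", 0) == 0]

def run_all(patrols):
    """Run every registered process; collect results keyed by process name."""
    return {name: fn(patrols) for name, fn in QA_PROCESSES.items()}

patrols = [{"id": "P-001", "distance_km": 12.4},
           {"id": "P-002", "distance_km": 0}]
print(run_all(patrols))
```

New checks (boundary, speed, duplicates) would then just be additional decorated functions, which matches the email's point that adding algorithms should require no new UI work.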
[RAB1]I think this is the most important check. To me the main issue is bad GPS points, rather than patrols conducted in the wrong place.
[RAB2]Overall I think this is a good approach. There needs to be significant customizability by transport type (i.e. different thresholds assigned to different speeds). There also needs to be a manual (as described here) and a fully automated option (where QA/outlier detection is performed automatically and outliers are interpolated to something more reasonable).
[RAB3]Yes, a smoothing feature like this would be good. Possibly based on averaging all points within a certain radius to a single point/calculating a centroid.
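The centroid-based smoothing suggested in the comment above might look like this: collapse runs of consecutive points that all sit within a small radius (e.g. a GPS left on overnight at camp) into a single averaged point. The radius, distance approximation, and data shape are all illustrative assumptions, not a settled design.

```python
import math

def collapse_clusters(track, radius_km=0.05):
    """track: list of (lat, lon) points. Consecutive points within
    `radius_km` of the current cluster's first point are averaged
    into a single centroid point."""
    if not track:
        return []
    out = []
    cluster = [track[0]]

    def centroid(pts):
        return (sum(p[0] for p in pts) / len(pts),
                sum(p[1] for p in pts) / len(pts))

    for pt in track[1:]:
        lat0, lon0 = cluster[0]
        # Rough equirectangular distance; adequate at these small scales.
        dlat = math.radians(pt[0] - lat0)
        dlon = math.radians(pt[1] - lon0) * math.cos(math.radians(lat0))
        if 6371.0 * math.hypot(dlat, dlon) <= radius_km:
            cluster.append(pt)
        else:
            out.append(centroid(cluster))
            cluster = [pt]
    out.append(centroid(cluster))
    return out

# Example: three jittered points at a rest stop collapse to one centroid,
# followed by the next genuine track point.
track = [(0.0, 0.0), (0.0001, 0.0001), (0.0002, 0.0), (1.0, 1.0)]
print(collapse_clusters(track))
```

A real implementation would likely also want a minimum dwell time before collapsing, so that a slow but genuine patrol segment is not smoothed away.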