Sunday 14 April 2013

The search for accurate data

As we discussed in the last post, we are having trouble telling which data is accurate and which is not. In today's post we discuss the steps we have taken to get more information out of the logger so that we can pick out the accurate data.

We first cleaned the data with a simple filter that strips out points where the cats are travelling at 0 kph or faster than 10 kph.
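For the geeks, here is a minimal sketch of what a filter like that could look like in Python. It is not the code we actually ran. The point layout (a dict with a `speed_kph` key), the function name and the sample values are all made up for illustration.

```python
# Illustrative sketch only: the dict layout and names are assumptions,
# not the format produced by the logger software.
MAX_SPEED_KPH = 10.0  # upper limit used by the initial filter


def filter_by_speed(points, max_kph=MAX_SPEED_KPH):
    """Drop points recorded at 0 kph or at more than max_kph."""
    return [p for p in points if 0 < p["speed_kph"] <= max_kph]


# Made-up sample points, not real cat data.
sample_points = [
    {"lat": 10.0000, "lon": 20.0000, "speed_kph": 0.0},   # stationary, dropped
    {"lat": 10.0001, "lon": 20.0001, "speed_kph": 3.5},   # kept
    {"lat": 10.0500, "lon": 20.0500, "speed_kph": 42.0},  # too fast, dropped
]

kept_points = filter_by_speed(sample_points)
```

Passing a different `max_kph` makes it easy to try other cut-offs without touching the function.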

However, cats can go faster than 10 kph, so our initial speed-based filter wasn't a very good way of separating good data from bad. According to Google, a cat can travel at about 5 kph pretty much indefinitely and reach speeds of 50 kph over short distances. So yes, Magellan could easily travel further than he does. However, I still don't believe he made it to the industrial zone in the next suburb over in the space of 20 seconds.
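One way to put a number on that disbelief is to work out the speed implied by two consecutive fixes: the distance between them divided by the time between them. The sketch below uses the standard haversine formula for the distance. The function names, the tuple layout and the 50 kph ceiling (borrowed from the Google figure above) are assumptions for illustration, not something our software does yet.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implied_speed_kph(p1, p2):
    """Speed needed to get from p1 to p2; each point is (lat, lon, seconds)."""
    seconds = p2[2] - p1[2]
    if seconds <= 0:
        raise ValueError("points must be in increasing time order")
    distance_km = haversine_km(p1[0], p1[1], p2[0], p2[1])
    return distance_km / (seconds / 3600.0)


def is_plausible_jump(p1, p2, max_kph=50.0):
    """True if a cat could have covered the distance (assumed ceiling: max_kph)."""
    return implied_speed_kph(p1, p2) <= max_kph


# Made-up example: about 1.1 km apart, logged 20 seconds apart.
here = (10.00, 20.00, 0)
there = (10.01, 20.00, 20)
```

With those made-up points the implied speed comes out at roughly 200 kph, so `is_plausible_jump(here, there)` is false, which matches the gut feeling about the industrial zone.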

So let me explain how the data is gathered.

Warning: Technical Information alert. I'll add cute cat photos below to keep the non-geeks interested.

The GPS logger is attached to the cat's collar at the start of the day. The cats are fed and then the cat door is opened.

In the evening, the cats are called inside, fed and locked inside. The GPS logger is removed and plugged into the computer to charge and for us to download the data.

The data from the device is retrieved using a program called @trip. The data is then exported from @trip into gpx format. The gpx file that @trip produces only contains latitude, longitude, elevation and speed.

The gpx file format does support other data such as recording the number of satellites used, horizontal dilution of precision and vertical dilution of precision.
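To show what those extra fields look like, here is a small sketch that reads them from a gpx track using Python's standard library. In GPX 1.1 a track point can carry optional `<sat>`, `<hdop>` and `<vdop>` elements. The sample file content, the coordinates and the function name below are invented for illustration and are not output from our logger.

```python
import xml.etree.ElementTree as ET

# GPX 1.1 namespace; files written against another GPX version use a different one.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Invented sample data: the second point has no precision information.
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="10.0000" lon="20.0000">
      <ele>12.0</ele>
      <sat>7</sat>
      <hdop>1.2</hdop>
      <vdop>1.9</vdop>
    </trkpt>
    <trkpt lat="10.0001" lon="20.0001">
      <ele>12.5</ele>
    </trkpt>
  </trkseg></trk>
</gpx>"""


def read_track_points(gpx_text):
    """Return one dict per track point; missing optional fields become None."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):

        def child(tag, cast):
            text = trkpt.findtext(f"gpx:{tag}", default=None, namespaces=GPX_NS)
            return cast(text) if text is not None else None

        points.append({
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            "ele": child("ele", float),
            "sat": child("sat", int),     # number of satellites used
            "hdop": child("hdop", float),  # horizontal dilution of precision
            "vdop": child("vdop", float),  # vertical dilution of precision
        })
    return points


track_points = read_track_points(SAMPLE_GPX)
```

Points that lack the optional elements come back with `None` in those fields, so it is easy to see at a glance whether a given export contains the precision data at all.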

So the question becomes: can the GPS logger actually record that additional information about satellites and precision?


So we went to Google for answers.

Corsair likes it when we sit still and play on the computer.



After a long time Googling we found a document that someone had assembled explaining how the data is actually stored and extracted from the GPS unit.  They had gathered this information by reverse engineering the protocol that @trip uses to communicate with the GPS. They used this to build an open source program called igotu2gpx.  

We installed igotu2gpx and used the program to extract some data from the logger and export it as gpx.

This data set contained longitude, latitude, elevation and number of satellites. But it didn't present data on the horizontal dilution of precision or speed.

We wanted to know speed and the horizontal dilution of precision, so we had to change the program to do that. It was surprisingly easy, as the guys who had written the software had already found those values in the data and made them available; they just weren't being shown in the data outputs. So with a little C++ coding the software was adapted to give us the data we wanted. Hooray for Open Source Software!

Magellan also has excellent coding skills.



Next step is to build the PostGIS database.  And possibly clean our desks before posting pictures of them to the internet :)

The next post will present some maps using our newly refined data to see if that makes a difference.

Also, Magellan didn't come home for dinner a couple of nights ago, so we have some night time data. It will be interesting to see if there is a difference between his night time and day time roaming habits.

And Nigel wants to buy two more GPS loggers so we can track all three cats at once....

If you are enjoying this blog, or would like to critique our methodology, please comment and let us know.

Thanks for reading








2 comments:

  1. Thoroughly enjoying it, great science :)

  2. I linked to your blog from Ravelry and am still enjoying reading your findings as you explore the cats' world and work on narrowing down your data coming in.
