I was amazed this morning to read that there is another version of me living on the internet. Not the "me" I present on social media or my blog, but a version of me that has been curated out of a mix of information I've voluntarily shared, data gathered while I've been browsing the web and buying services and products, and data that has been generated using sophisticated modelling techniques to predict my likes, dislikes, purchasing habits and more. Let's call her my "data twin".

Whilst I'm not surprised that this data is being collected, I was surprised to learn the types of predictions these models are making.

When Carl Miller requested access to his data-self, he discovered some interesting facts - the likelihood of him being interested in gardening was, for example, 23.3% and it had been determined that his household did not have a regular interest in reading books. I wonder with what accuracy a model would be able to guess my hobbies? Does this explain why news stories about famous chefs and bakers often appear in my news feed?

I thought I'd test this by reviewing the data Facebook has on me. In the last year, Facebook has made it a lot easier for you to see the advertising categories it has assigned to you based on your activities within the app. I was interested to see that it had accurately guessed I've recently joined the gym and had listed me as interested in "physical fitness". Though it also seems to think I'm interested in "automobiles" despite the fact I cannot drive.

Who is benefiting from my data twin?

These models are incredibly useful to companies: they make it possible to choose, with a great deal of precision, which adverts to push out to us and even the time at which we will be most receptive to them. I'm not ashamed to say that I know these methods work. I am often surprised at how accurately a product has been marketed to me and have bought products and services as a result. It would seem that perhaps my data twin resembles me quite well.

But I wonder what I could learn about myself by requesting access to my data twin, and whether I could take back control of the types of news stories and adverts that are being fed to me on a daily basis.

In a similar attempt to help people gain back control of their data-self, Experian has launched a campaign to show consumers how they can work with the financial data that has been collected about them to improve, for example, their access to credit.

As the internet of things continues to grow, we are only going to continue creating more data, feeding our data twin. We all have a choice as to whether we help to shape what our data twin looks like by regularly reviewing what data on us has been collected and correcting any errors or erroneous assumptions.

Personally, I've always liked the idea of having an identical twin.

I decided to try to reconstruct my own data doppelganger - to come face-to-face with myself as I exist in the data, and so to understand a little more about the ecology of exchanges and brokers, suppliers and analytics firms that have built a version of...well, myself.