1. Hello, I am just launching my startup, which is based on the development of an application. Among other things, this application aims to scrape people's data from public sources, so I have a few questions. What should I do to scrape in line with the GDPR?
Answer: When scraping data, one of the most important things you need to do is determine your role as either a data controller or a data processor. This mostly depends on who your clients are. Secondly, you need to ensure that the scraped data is not subject to exemptions under the GDPR, such as data processed in the public interest.
2. If in the future I wish to upgrade the application with machine learning techniques, could I use the stored data?
Answer: This depends on what data you are storing. If it's personal data, you will most likely be acting as a data controller, so you need a valid lawful ground for processing and you must provide an adequate Privacy Notice to the data subjects.
3. To summarize, what "privacy by design" strategies should I take into account for both aspects?
> 1. Thanks, Andrei, very useful! Could you go into more depth about "scraping people's data from public sources"? Are these sources infomediaries, communication media, or something else?
Answer: Public sources are, for example, news websites or websites where details about an individual can be found under certain circumstances, such as a writer's biography or public officials' CVs.
> 2. Would a valid lawful ground be based on just publishing an adequate Privacy Notice on the website to let data subjects know the details, or would we also need to contact them directly and get their consent? Thanks in advance
Answer: It depends on how you provide the scraping service. I assume that someone would come to you and ask you to scrape their data, in which case you would need that person's permission.
> If a crawler like Google is allowed to crawl data from websites, may I crawl Google's data with an app? In other words, am I allowed to crawl a crawler?
Google, like many others, has Privacy Notices, Privacy Policies, and other documents and tools that help users understand what is happening with their data and allow them to exercise some of their rights. I am not saying they are perfect, not by a long shot, but they at least have something. And an army of well-prepared lawyers. Still, they have failed numerous times and were just fined 50 million euros in France.
So, long story short: you can try to act like Google, but you need to be ready to face the consequences if you don't comply with privacy laws, especially the EU GDPR.