Data Science Legal Question for the board

1,655 Views | 5 Replies | Last: 4 yr ago by Ulrich
aeroag14
I am a bit of an aspiring data scientist and have, over the last couple of years, compiled a ton of sports-related statistics into a personal database. To this point I have only referenced the database for personal use. However, it has recently occurred to me that I could leverage the database to create a product that I could then sell for profit. My question relates to the legality of using the statistics I have collected in a commercial sense.

From my (very limited) legal understanding, statistics such as scores, rushing yards, turnovers, etc. are considered to be "facts" and therefore not subject to copyright. Yet many websites prohibit the use of data collected on their website for commercial purposes. I assume that, since it is explicitly stated, there is a valid legal reason. I was just curious if anyone understood or could explain what the deal is.

Additionally, I am curious how one would legally go about creating a database that could be used for commercial purposes. Presumably you could watch every single game and calculate each statistic by hand, and that would be legal but physically impossible. You could collect every newspaper from around the country and hand-pick statistics out of the box scores, which is also pretty much impossible. Really, the only other way would be to look up the statistics of interest online. So how could you do it? There are services that sell sports data, but even they presumably didn't watch literally every game to compile their databases, so how did they do it? On an even more fundamental level, it seems hard to believe that there is not a straightforward process for legally collecting publicly available "facts".

Again, I have no real legal background and am mainly just curious about what the deal is.
Engine10
Not a lawyer, but going by the NFL's T&C section, it looks like you just need to obtain permission to access and pull data from the API?
Quote:

You may link to the home page of the Services without obtaining our permission. For any other type of link to the Services, you must obtain our express written permission. To seek our permission, you may write to Legal Department, Attn: NFL.com, National Football League, 345 Park Avenue, 7th Floor, New York, NY 10154. If you provide a third-party Web site that links to the Services, you: (a) shall not create a frame, browser or border environment around any of the content of the Services; (b) shall not imply that we endorse or sponsor your Web site or any of its products or services; (c) shall not present false information about us, the Services or any of our products or services; (d) shall not use any of our trademarks without our express prior written permission; and (e) shall not include any content that could be construed by us as distasteful, offensive or controversial. Notwithstanding anything to the contrary contained in this Agreement, we reserve the right to deny or rescind permission to link to the Services from any Web site, and to require termination of any link to the Services, for any reason in our sole and absolute discretion.
sam88
I can't help out with the legal details, but if you are looking to sell the data, I think you will have some strong competition from already available sources. Here are a few that have exhaustive stats:
https://www.sports-reference.com/cfb/
https://console.cloud.google.com/marketplace/details/ncaa-bb-public/ncaa-basketball
http://mlb.mlb.com/stats/sortable.jsp?c_id=mlb

If you are an aspiring data scientist, another option is to share your work as an open source project on GitHub or Reddit. Good luck.
one MEEN Ag
AeroAg,

Awesome that you're collecting data. While I can't answer the legal questions, I do have a couple of questions that I think you could ask yourself.

-What service does the data provide? Can that data be obtained elsewhere already?
-If you want to start a business, do you need to create the database yourself anyway? The value won't be in the data itself but in the insights you can build on top of it, especially if it's easy-to-come-by data like sports statistics.
-Long term, do you want to be a sales/service shop for data, or just transition into a data analytics role in your career? I think you could make a cool blog and blast people on LinkedIn about it, and that would help you solidify your career aspirations in data analytics more so than trying to sell this data.

Just some musings from someone else who also has an interest in data analytics beyond their day job.
Proposition Joe
The data itself is considered "facts"; the issue stems from how you acquire the data.

If you are using a site's API to amass the data then you are fine. If you are scraping it off their website and/or violating any of their terms of use, then you've left yourself open to legal action if you are turning around and trying to sell that scraped data for a profit.

All that being said, pretty much every statistic out there has already been compiled in various databases (some public, some private). It was a booming industry about 10 years ago. If there's an algorithm that you've developed for a predictive stat, there is obviously some value in that, but 99 times out of 100 those algorithms don't stand the test of time.
ABATTBQ11
What sport, and would you ever share it for fun or educational use?
Ulrich
Sports statistics and analytics is an industry with quite a few established companies, so in addition to compiling data you'll need a hook: either have data no one else has, or buy the data and come up with more convincing analytics than anyone else.

I play around with basketball analytics for my own curiosity, but I wouldn't want to be trying to break in professionally, not now and not without connections in both tech and front offices.