Data mining recently made big news with the Cambridge Analytica scandal, but it is not just for ads and politics. It can help doctors spot fatal infections and it can even predict massacres in the Congo.
Hosted by: Stefan Chin
Head to https://scishowfinds.com/ for hand selected artifacts of the universe!
Support SciShow by becoming a patron on Patreon: https://www.patreon.com/scishow
Dooblydoo thanks go to the following Patreon supporters: Lazarus G, Sam Lutfi, Nicholas Smith, D.A. Noe, سلطان الخليفي, Piya Shedden, KatieMarie Magnone, Scott Satovsky Jr, Charles Southerland, Patrick D. Ashmore, Tim Curwick, charles george, Kevin Bealer, Chris Peters
Image Source: https://commons.wikimedia.org/wiki/File:Swiss_average.png
Also, what’s scary about data mining? If Target knows you are pregnant, what’s the harm? That you get access to coupons that save money on the things you are going to purchase for your baby anyway? I remember a time when I bought a sous vide, added Modernist Cuisine and an iSi dispenser to my wishlist, watched a few cool science videos on liquid nitrogen, and suddenly a video was suggested about cryosteak. It’s a really cool thing; I learned about something I didn’t know before. The coolest part was that something was suggested to me that was right in line with my interests at the time. This sounds like education at its finest. I know it’s education targeting my wallet: I went straight to Amazon and added a liquid nitrogen dewar to my wishlist. Nonetheless, I learned something extremely relevant to my interests because a huge set of data about people with similar interests was both available and examined by computers to suggest that video.
So maybe I’m missing something. I like learning, so please point it out if I am. But what’s so scary about collecting data? It’s not like that judgy suburban housewife down the street is poking through the data collected about me so she can judge more. And if she is, what do I care if she’s got nothing better to do than judge others? I don’t expect she can be a very enlightened, satisfied, happy person if that’s what she is choosing to spend her time doing. And I can comfort myself with the fact that I’m having a far nicer time than she is.
Please, someone, tell me why I should be scared and why I should prevent the tracking at all costs? What am I missing?
New tech for same fuckery (to paraphrase Sagan)
only want suggestions for things I already like so I can stifle potential growth in taste/knowledge.
...and inflation of crime stats in poor hoods from disproportionate policing used as justification for more cops & arrests there.
Please never lump weather(wo)men and their fake models in with real science trying to predict things in an overwhelmingly rigged world.
I do appreciate your description of virtually all EDA and machine learning algos as applied statistics, cause as a man with an adv degree in Applied Math and Statistics that's ALL I've seen.
Idk about the XAI (Explainable AI) stuff...I have to look at it soon.
I've been trying to datamine this old game that I used to play on the first Xbox. It's called "Area 51" and it was made by Midway Studios. Can someone help? It was ported to PC, and that's what I'm trying to mine.
Shocking accuracy? I kept seeing baby stuff ads based solely on my gender and age, I guess. I worked hard marking all diaper ads as sexually explicit. Now I'm getting dresses and jewelry ads and webinars on how to find myself, find myself a man and be happy. It's nowhere near being accurate, it's pretty much generic. Forget that I'm a qualified professional with tons of hobbies, who cares. They are leading me to buying diapers.
I took the Dual Enrollment Computer Science class at UT Austin and one of the modules explained data-mining and data collection. I learned so much from that class.
Data-mining doesn't surprise me anymore as a result. And it makes sense: Google is practically the King of data-mining. Even if you're not using any of their services, if you visit a website with Google Ads, unless you turn off third-party cookies, they can still collect your data.
Be careful of what you do on the Internet.
"Data mining is more about spotting patterns than explaining them." That's not very accurate. Data mining is very much about explaining patterns. That's why decision trees and Bayesian classifiers are data mining algorithms and something such as a neural network is not. Though all can produce similar results, the former are considered data mining algorithms because they describe the patterns they find, while a neural network obfuscates anything it discovers.
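To make the "describes the patterns it finds" point concrete, here is a minimal sketch of a one-level decision tree (a decision stump) in plain Python. The data, the names `weekly_visits` and `bought`, and the helper `fit_stump` are all made up for illustration; a real decision tree learner splits on many features and uses a purity measure rather than a raw error count.

```python
def fit_stump(values, labels):
    """Find the threshold t that best separates the labels with the rule
    'predict 1 if value > t, else 0'. Returns (threshold, errors)."""
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between neighbouring distinct values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]

    def errors(t):
        # Count training examples the rule gets wrong for threshold t.
        return sum((v > t) != bool(y) for v, y in zip(values, labels))

    best = min(candidates, key=errors)
    return best, errors(best)


# Hypothetical toy data: store visits per week, and whether the person bought.
weekly_visits = [1, 2, 3, 4, 6, 7, 8, 9]
bought = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, mistakes = fit_stump(weekly_visits, bought)
print(f"rule: predict 'buys' if weekly_visits > {threshold} "
      f"({mistakes} training errors)")
```

The result is a rule a person can read and argue with, which is the property the comment is pointing at.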
He's talking about Target and baby due dates when we all know that data mining is being used for a lot more than that. It's scary how much information these companies have on us, and it's definitely not right.
Nice video. Regression and classification are indeed very, very similar. The math is almost identical. Anomaly detection is like model diagnosis, looking for points that don't fit the model.
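A rough sketch of the "anomaly detection as model diagnosis" idea: fit a straight line by least squares, then flag the points whose residuals are far from the rest. The data, the function names and the 3.5 cutoff on a median-based score are illustrative assumptions, not anything from the video.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def find_anomalies(xs, ys, cutoff=3.5):
    """Return indices of points whose residual is unusually large.

    Uses a robust score based on the median absolute deviation (MAD),
    so a single wild point does not hide itself by inflating the spread.
    """
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    centre = median(residuals)
    mad = median([abs(r - centre) for r in residuals])
    if mad == 0:
        return []  # residuals are (almost) all identical; nothing to rank
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [i for i, r in enumerate(residuals)
            if abs(r - centre) / scale > cutoff]


# Hypothetical data: roughly y = 2x + 1, with one point that does not fit.
xs = list(range(10))
ys = [1.0, 3.1, 4.9, 7.0, 9.1, 30.0, 13.0, 14.9, 17.1, 19.0]

anomalies = find_anomalies(xs, ys)
print("indices that do not fit the model:", anomalies)
```

The regression step and the anomaly step share the same fitted model; only the question asked of the residuals changes.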
The one big thing people need to recall is Goodhart's Law: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes."
So these regularities will frequently die when they start being used for some purpose.
This is market research on steroids. I think it included inventorying (association learning), no? It should be regulated, but you can't ban it, any more than you can ban advertising itself. It should not be used for ill. Vote with your dollars.
I swear it's not just text data they mine, I swear someone (something) is listening to my cell phone microphone. I was talking with a co-worker who asked about Sirius (with my cell on the desk), and we were talking about the Octane station. I mentioned that I hadn't heard an Alter Bridge or Tremonti song on there yet. That very evening at work, I went on YouTube to play a music mix, and you want to know what band popped up in "my mix" that hadn't been in there before? Yep. Alter Bridge. This isn't the first time, which makes it not just a coincidence. I swear I'm going to make a tin foil hat and a bunker soon.
I heard sex workers had a similar issue. FB was locating them and matching clients with their non-sex-work profiles, endangering the workers. But the thing is, the sex workers didn't live near the clients, didn't go to any similar places, and never took their private phones with them to encounters. FB refused to admit it was locating them somehow, even though there is no way it was a coincidence, since it happened to so many workers.
Wow... I've not been keeping close track but 5+ m subs is a lot!
Has it been growing really fast lately or have I not paid enough attention?
Gratz on 5m!
Awesome and deserved, lots of hard work and time but it's paid off!
Great episode! A minor quibble: I would have put a bit more emphasis on the fact that clustering doesn't require any training data, whereas both classification and regression usually do, though there are many unsupervised methods for all of these tasks.
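As a small illustration of that point, the k-means sketch below groups points using only their coordinates, with no labels supplied at any stage. The points, the choice of `k = 2` and the simple "evenly spaced" starting centroids are assumptions made for the example; real implementations usually use random or k-means++ initialisation.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of coordinate tuples.

    No training labels are used: the groups come only from distances.
    Starting centroids are taken at evenly spaced positions in the list
    (an assumption chosen to keep the example deterministic).
    """
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(coord) / len(members)
                                     for coord in zip(*members))
    return labels, centroids


# Hypothetical unlabeled data: two visibly separate blobs.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]

labels, centroids = kmeans(points, k=2)
print("cluster for each point:", labels)
```

A classifier or regressor would need a column of known answers alongside `points`; here the cluster numbers are invented by the algorithm itself.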
Data mining can reveal a whole lot and can be pretty reliable, but then everyone hates it if people use profiling methods. Even though profiling, using a ton of data points on people (including race, gender, age, family status etc), can reveal the ones most likely to bring a bomb on a plane, or have mental issues, shoot up a school, or commit other crimes. This can speed up long security lines and lower costs. I for one am perfectly fine with profiling.
Go automation go. Now instead of sneaking around for data we ought to give it more data than we think it can handle. Wearable medical monitors to feed an AI doctor with everything to do with our health. If enough of us feed an AI doctor it could learn "healthy" numbers from "sick" numbers, further learn "worrying" numbers from "good trend" numbers. Then instead of waiting for worrying trends to become a chronic condition we can head it off with simple preventative treatments.
Please tell me I'm not the only one that thought of Data from Star Trek Next Generation every time he said Data in the intro of this video. I kind of want to edit it so Data appears on screen every time he says Data. Not going to, but it would be funny :)
I get targeted Chinese women products and I'm a male adult from Guatemala, lmao. Probably because I stopped using most social networks, except a meme app (which only gave me ads for itself) and Twitter.
This still has a long way to go. It works for simpletons and simple-minded people who literally spread their likes in easy-to-follow patterns that even a human could see.
"A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed.
While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data."
Y'know, I straight-up filled out Google's ad targeting and interest questionnaire because I hoped it would stop showing me (under 21, a vegetarian, no driver's license) ads that were irrelevant to me, but I still get cheeseburgers driving specific cars down to the local brewery. Kinda wish the algorithms were even better at creepily mining all my deepest darkest secrets.
Data mining at its most trivial is annoying. If I like a certain 12 string guitar, it may actually be the voice that goes with it, rather than the 12 string. If I source a plumbing bit, it may be for an art piece. If I want to see conductors' batons, I really am not looking for Harry Potter wands. The end result is homogeneity in tastes and interests. An interest in guerrilla gardening does not indicate a particular cant or esthetic or avocation. Maybe I'm just researching bentonite.
Yeah, so? That's the whole point, you think Spotify is selling their subscription to a 30 year old mostly listening to Heavy Metal, Punk and Indie Rock by forcing him to listen to a genre that is completely irrelevant to him? You could argue that it's a negative reinforcement to buy a subscription to get rid of the ads. However positive reinforcement ALWAYS works better (e.g. "Oh look, Spotify really knows my taste and I didn't have to dig for hours for these really sweet tracks!") But even when I'm not subscribing, they can sell their commercial breaks for a fortune to external companies, IF they can prove it's aimed at the right demographic. And obviously they don't even try...
Some algorithms can spot people with bipolar disorder, and also recognize when they're going to be manic - and display ads for expensive-but-exciting stuff when the user is likely to be not in control of their own actions.
In theory I’m all for data mining if it makes the product better, but I’d really like it if companies would just ask me for it, and I would give them the data that I want instead of them just finding every little scrap of information (like age and sexual orientation... that’s just creepy).
We will accept entries between now and June 15th. Posters will ship in July.
YESTERDAY WAS EVERYTHING OUT JUNE 30th.
Filmed primarily during the tour celebrating the 10th anniversary of our debut album, this feature length documentary, directed by our friend Matthew Mixon, follows the band as we reunite with our original vocalist Jesse for the first time since our split a decade prior. The film explores the fatal tragedy that brought the band together and follows our journey across North America as we face old ghosts and attempt to reconcile the past.
Signal Spam is a public-private partnership that allows users to report anything that they consider to be spam in their e-mail client or webmail in order to assign it to the public authority or the professional that will take the required action to combat the reported spam.
The Spam Signal reflex.
A spam report collects all the technical information required to identify a spammer, whether the report relates to marketing abuse or cyber-criminal spam. Signal Spam is responsible for qualifying your report and distributing useful information to those fighting spam.
Download the plugin that corresponds to your messaging environment and install it.
Report spam from your e-mail and track developments in your personal space.
Thanks to your reports, Signal Spam collects information essential to the identification of spammers and shares it with the authorized actors able to take action adapted to each specific report.
Consult the code of ethics.
The reports provide the digital evidence that investigators and public authorities need to engage legal procedures, controls and sanctions against companies which send abusive marketing e-mails, and legal actions against cyber criminals.