Questions mostly concerned with managing data, without focus on pre-processing or modelling.
Questions tagged [data]
866 questions
13
votes
3 answers
How to create US state choropleth map
I have a value associated with each US state (let's pretend it's the average temperature in January for each state). I want to display this data as a heat map of the United States. To be clear, it would be a map of the US with each state having a…

user15180
7
votes
4 answers
Docker for data science
I recently started to read articles about Docker.
To me, in data science, Docker is useful because:
1) You have a totally separate environment, which protects you against library and dependency problems.
2) If your application modifies, for…

nolwww
6
votes
3 answers
How to understand data (What questions do you ask yourself)?
I'm curious to see what questions other people ask themselves when faced with new data. What are some common questions you ask yourself when trying to understand the data or performing EDA? What labels do you deem necessary? Is there a 'correct'…

zampoan
3
votes
4 answers
Mathematics major for data science
I'm a second-year student who recently transferred from a Computer Science major to a Mathematics major, and I have a bit of an issue: I can choose between the applied mathematics, pure mathematics, and statistics concentrations.
Along with this major,…

user6214
3
votes
1 answer
How do I best visualize this voltage data for a science project
I'm helping my son with his 7th grade science project. We've had a good deal of fun with our experiments with solar arrays and charging 12-volt UPS batteries! But I am not sure how to interpret the data!
Our original hypothesis was that the length…

user3687778
2
votes
0 answers
What proxies could be used to assess the economic value of Stack Overflow for its users?
I'm intrigued by the open data provided by Stack Exchange, and have been running some really interesting queries on the data.stackexchange.com page (using the Stack Overflow dataset).
In particular, I would like to dive deep into this claim taken from…

Andrea M
2
votes
3 answers
How to store complex tables and structures?
I'm wondering about general approaches to storing complex tables and structures. For example, imagine I have a table like this:
A1 A2 A3 A4 A5 B1 B2 B3 B4 B5 C1 ....Z5
individuals
1 . . . . . . . . . .
2 …

skan
2
votes
1 answer
Count the number of false positives with respect to the first class
Can anybody tell me the formula for finding the number of false positives with respect to the first class?
Here, $y$ is the truth/target and $a(x)$ is the prediction.

Bootuz
2
votes
0 answers
What are some real-world examples of a 2D data set where agglomerative methods work and K-means doesn't? (for two real groups of data)
What is a 2D dataset with two real groups of data where an agglomerative clustering method will be able to cluster the data correctly, and K-means will not?
I thought about a data set that would look like the figure below, but still I can't think of…

Emran Tamimi
2
votes
1 answer
PCA on acceleration time series data
Has anyone attempted principal component analysis on time series acceleration data (i.e., data from accelerometers and other sensors) and tried compressing it and then regenerating the data with minimum error?
I wanted to know if this is possible…

user16058
1
vote
2 answers
How to Make Meaningful Conclusions here?
I recently had an interview for my college and was asked the following question. The interviewer said that it was a data science question.
The question:
Suppose 7.5% of the population has a certain bone disease. During COVID…

FoundABetterName
1
vote
1 answer
Meaningful and Non-Meaningful Data?
I understand what meaningful data is: important information that can be used to evaluate something. But I don't get what non-meaningful data is. Is it less important data?

ThomasJH
1
vote
1 answer
Data science degree or Computer Science degree?
I'm going to be applying for college soon, and some of my choices offer a data science undergraduate degree. I've taken some online courses on introductory data analysis and have been doing Kaggle projects for a while, so I know it's something I enjoy. The only…

Ethan Shapiro
1
vote
2 answers
Do I need a datalake in my use case?
My web application stores usage data, for example:
tickets opened and closed
tasks executed
user scores
etc. I need to show dashboards and reports for usage and performance trends, like:
How many tickets were opened/closed in a period?
what is…

Glasnhost
1
vote
2 answers
How to scale numbers according to proportions
I have some data that I'm trying to represent in a visualization with a week-over-week change. So the source can be something like:
Fruit This Week Last Week Delta Proportion
--------------------------------------------------
Bananas …

I_Play_With_Data