Welcome to the regression section of the Python Machine Learning series. By this point you should already have Scikit-Learn installed. If not, install it now, along with Pandas and Matplotlib.
pip install numpy
pip install scipy
pip install scikit-learn
pip install matplotlib
pip install pandas
In addition to the packages used throughout this series, we also use Quandl here:
pip install quandl
First of all, what is regression, and why do we use it in machine learning? Its goal is to take continuous data, find the equation that best fits that data, and use it to predict a specific value. With simple linear regression, you do this just by creating a best-fit straight line.
From there, we can use the equation of that line to forecast future prices, where the date is the x-axis.
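To make the idea of a best-fit line concrete, here is a minimal sketch in plain Python. The day numbers and prices are invented for illustration, and fit_line and predict_price are hypothetical helper names, not part of any library. It uses the standard least-squares formulas for the slope and intercept.

```python
from statistics import mean

# Hypothetical data, invented for illustration: day index (x) and price (y).
days = [0, 1, 2, 3, 4]
prices = [50.0, 52.5, 53.0, 56.5, 58.0]

def fit_line(xs, ys):
    """Return the least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(days, prices)

def predict_price(day):
    """Evaluate the fitted line at a given day index."""
    return slope * day + intercept

# Extend the line one day past the data to get a forecast.
print(predict_price(5))
```

With real data, the x values would be dates converted to numbers, and the fit would normally be left to a library rather than written by hand.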
A popular use of regression is predicting stock prices. This works because we are looking at the flow of price over time and using a continuous data set to try to predict the next price in the future.
Regression is a form of supervised machine learning, which means the scientist teaches the machine by showing it features and then showing it the correct answers. Once the machine is taught, the scientist usually tests it on some unseen data, where the scientist still knows the correct answer but the machine does not. The machine's answers are compared with the known answers, and from this the machine's accuracy can be measured. If the accuracy is high enough, the scientist may consider using the algorithm in the real world.
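The teach-then-test workflow can be sketched in plain Python as follows. Everything here is an assumption made for illustration: the data is synthetic, the 80/20 split is an arbitrary choice, and fit_line and r_squared are hypothetical helpers. The "accuracy" reported for a regression is taken to be the coefficient of determination (R^2) on the held-out data.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m * x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    m = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return m, y_bar - m * x_bar

def r_squared(xs, ys, m, b):
    """Coefficient of determination of the line y = m * x + b on (xs, ys)."""
    y_bar = mean(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Synthetic data, invented for illustration: y = 3x + 10 plus small noise.
rng = random.Random(0)
xs = [float(x) for x in range(100)]
ys = [3.0 * x + 10.0 + rng.uniform(-5.0, 5.0) for x in xs]

# Shuffle, then hold out the last 20% as data the model never sees in training.
pairs = list(zip(xs, ys))
rng.shuffle(pairs)
cut = int(len(pairs) * 0.8)
train, test = pairs[:cut], pairs[cut:]

# "Teach" on the training pairs, then score on the unseen test pairs.
m, b = fit_line(*zip(*train))
accuracy = r_squared(*zip(*test), m, b)
print(f"slope={m:.2f} intercept={b:.2f} test R^2={accuracy:.3f}")
```

In Scikit-Learn, the same steps are usually handled by train_test_split from sklearn.model_selection and the fit and score methods of LinearRegression, where score also returns R^2.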
Since regression is so widely used with stock prices, we can start with an example there. To begin, we need data. Sometimes data is easy to get, and sometimes you have to go out and collect it yourself. Here we can at least start with simple stock price and volume information from Quandl. We will grab the stock price of Google, whose ticker is GOOGL:
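import pandas as pd
import quandl

# Download the price history for GOOGL from Quandl's WIKI dataset.
df = quandl.get("WIKI/GOOGL")

# Show the first five rows to see which columns we got.
print(df.head())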
Note: when this tutorial was first written, the Quandl module was referenced with an uppercase Q. It now uses a lowercase q, so write import quandl.
Here we have:
              Open    High     Low   Close    Volume  Ex-Dividend  \
Date
2004-08-19  100.00  104.06   95.96  100.34  44659000            0
2004-08-20  101.01  109.08  100.50  108.31  22834300            0
2004-08-23  110.75  113.48  109.05  109.40  18256100            0
2004-08-24  111.24  111.60  103.57  104.87  15247300            0
2004-08-25  104.96  108.00  103.88  106.00   9188600            0

            Split Ratio  Adj. Open  Adj. High  Adj. Low  Adj. Close  \
Date
2004-08-19            1     50.000      52.03    47.980      50.170
2004-08-20            1     50.505      54.54    50.250      54.155
2004-08-23            1     55.375      56.74    54.525      54.700
2004-08-24            1     55.620      55.80    51.785      52.435
2004-08-25            1     52.480      54.00    51.940      53.000

            Adj. Volume
Date
2004-08-19     44659000
2004-08-20     22834300
2004-08-23     18256100
2004-08-24     15247300
2004-08-25      9188600
This is a good start: we have the data, but maybe a bit too much of it.
We have quite a few columns here. Many are redundant, and a couple don't change much. We can see that the regular and adjusted (Adj.) columns duplicate each other, and the adjusted columns are the better choice. The regular columns hold the prices as quoted on the day, but stocks undergo something called a stock split, where one share suddenly becomes two, so the price of a share is halved while the value of the company does not change. The adjusted columns account for stock splits over time, which makes them more reliable for analysis.
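The effect of a split adjustment can be shown with a little arithmetic. In the output above, the raw Open on 2004-08-19 is 100.00 while the Adj. Open is 50.000, which is what a 2-for-1 adjustment looks like. The sketch below is a simplified illustration only: the prices are invented, split_adjust is a hypothetical helper, and real adjusted prices may also account for dividends.

```python
def split_adjust(prices, split_index, ratio):
    """Divide every price before split_index by ratio so the series is continuous."""
    return [p / ratio if i < split_index else p for i, p in enumerate(prices)]

# Hypothetical raw prices: a 2-for-1 split takes effect on day 2,
# so the quoted price drops by half even though nothing else changed.
raw = [100.0, 102.0, 51.5, 52.0]

adjusted = split_adjust(raw, split_index=2, ratio=2.0)
print(adjusted)  # [50.0, 51.0, 51.5, 52.0]
```

Without the adjustment, a model would see the move from 102.0 to 51.5 as a crash of roughly 50% rather than the small change it really was.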
So let's go ahead and pare down the original DataFrame:
df = df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']]
Now we have just the adjusted columns and the volume column. A couple of points are worth making here. Many people talk or hear about machine learning as if it were a dark art that conjures value out of nothing. Machine learning can highlight value that is in the data, but the value has to be there in the first place. You need meaningful data. So how do you know whether your data is meaningful? My best advice is simply to use your brain. Think about it: are historical prices indicative of future prices? Some people think so, but over time this has repeatedly been shown to be wrong. What about historical patterns? They carry a bit more weight at the extremes (which machine learning can help find), but overall they are still fairly weak. What about the relationship between price changes and volume over time, together with historical patterns? Probably a little better. So, as you can already see, more data is not automatically better; what we need is useful data. At the same time, raw data sometimes needs to be transformed.
Consider daily volatility, such as the percentage difference between the day's high and low. What about the daily percentage change? Which would you consider the better data: simply Open, High, Low, Close, or Close, Spread/Volatility, and daily % change? I would expect the latter to be better. The former are all very similar data points. The latter is built from the same data as the former, but it carries far more valuable information.
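As a rough sketch of what such a transformation could look like in Pandas, the example below rebuilds the five adjusted rows printed earlier and derives two new columns. The column names HL_PCT and PCT_change are hypothetical, and so are the exact definitions: the high-low spread is taken as a percentage of the close, and the daily change as the open-to-close move as a percentage of the open. Other definitions would be equally reasonable.

```python
import pandas as pd

# The first five adjusted rows from the output printed earlier.
df = pd.DataFrame(
    {
        "Adj. Open": [50.000, 50.505, 55.375, 55.620, 52.480],
        "Adj. High": [52.03, 54.54, 56.74, 55.80, 54.00],
        "Adj. Low": [47.980, 50.250, 54.525, 51.785, 51.940],
        "Adj. Close": [50.170, 54.155, 54.700, 52.435, 53.000],
        "Adj. Volume": [44659000, 22834300, 18256100, 15247300, 9188600],
    },
    index=pd.Index(
        ["2004-08-19", "2004-08-20", "2004-08-23", "2004-08-24", "2004-08-25"],
        name="Date",
    ),
)

# Assumed definition: high-low spread as a percentage of the close.
df["HL_PCT"] = (df["Adj. High"] - df["Adj. Low"]) / df["Adj. Close"] * 100.0

# Assumed definition: open-to-close move as a percentage of the open.
df["PCT_change"] = (df["Adj. Close"] - df["Adj. Open"]) / df["Adj. Open"] * 100.0

# Keep the close, the two derived columns, and the volume.
features = df[["Adj. Close", "HL_PCT", "PCT_change", "Adj. Volume"]]
print(features)
```

The four raw price columns collapse into one price plus two measures of how the price moved during the day, which is the kind of more informative data described above.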