Week 4 – Working with Distributions and DataFrames

# Import the required packages
using Distributions, DataFrames

# Seed the random number generator
srand(1234);

# Question 4: Create the 30 x 3 array named array_1
# (30 rows and 3 columns, one column per rand(30) call)
array_1 = [rand(30) rand(30) rand(30)]
size(array_1)
array_1

30×3 Array{Float64,2}:
 0.590845   0.931115   0.643704 
 0.766797   0.438939   0.401421 
 0.566237   0.246862   0.525057 
 0.460085   0.0118196  0.61201  
 0.794026   0.0460428  0.432577 
 0.854147   0.496169   0.082207 
 0.200586   0.732      0.199058 
 0.298614   0.299058   0.576082 
 0.246837   0.449182   0.218177 
 0.579672   0.875096   0.362036 
 0.648882   0.0462887  0.204728 
 0.0109059  0.698356   0.932984 
 0.066423   0.365109   0.827263 
 ⋮                              
 0.0566425  0.404953   0.0396356
 0.842714   0.499531   0.79041  
 0.950498   0.658815   0.431188 
 0.96467    0.515627   0.137658 
 0.945775   0.260715   0.60808  
 0.789904   0.59552    0.255054 
 0.82116    0.292462   0.498734 
 0.0341601  0.28858    0.0940369
 0.0945445  0.61816    0.52509  
 0.314926   0.66426    0.265511 
 0.12781    0.753508   0.110096 
 0.374187   0.0368842  0.834362

# Question 5: Mean and variance of column 1
mean_column_1 = mean(array_1[:,1])
var_column_1=var(array_1[:,1])
println("mean=",mean_column_1)
println("var=",var_column_1)

mean=0.5014887976938368
var=0.10653465363277906

# Question 5 (continued): Mean and variance of column 2
mean_column_2 = mean(array_1[:,2])
var_column_2=var(array_1[:,2])
println("mean=",mean_column_2)
println("var=",var_column_2)

mean=0.4160447968360426
var=0.06360439983290869

# Question 5 (continued): Mean and variance of column 3
mean_column_3 = mean(array_1[:,3])
var_column_3=var(array_1[:,3])
println("mean=",mean_column_3)
println("var=",var_column_3)

mean=0.4372634519427959
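The three cells above repeat the same two calls for each column. As a rough sketch, the same kind of summary can be produced in one loop. The helper name column_stats and the small demo matrix are made up for illustration and are not part of the assignment. The helper writes the formulas out by hand, with the n - 1 denominator that Julia's var uses by default, so it relies only on Base functions.

```julia
# Hypothetical helper: mean and sample variance of one column, written out by hand.
function column_stats(col)
    n = length(col)
    m = sum(col) / n
    v = sum(abs2, col .- m) / (n - 1)   # n - 1 denominator, as in Julia's default var
    return m, v
end

# Small fixed matrix standing in for array_1, so the numbers are easy to check by hand.
demo = [1.0 10.0; 2.0 20.0; 3.0 30.0; 4.0 40.0]

# One (mean, variance) pair per column.
stats = [column_stats(demo[:, j]) for j in 1:size(demo, 2)]

for (j, (m, v)) in enumerate(stats)
    println("column ", j, ": mean=", m, " var=", v)
end
```

Replacing demo with array_1 should give the same values as the separate mean and var calls, up to floating-point rounding.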
var=0.07568707224628725

# Question 6: Import array_1 into a DataFrame named df
df = DataFrame(array_1)

# Check the struct fields (fieldnames) and the column names (names) of df,
# Julia's counterparts to inspecting a DataFrame in Python
f_name = fieldnames(df)
name = names(df)
println(f_name,name)

Symbol[:columns, :colindex]Symbol[:x1, :x2, :x3]

# Accessing different columns of df
df[:x3]

30-element Array{Float64,1}:
 0.643704 
 0.401421 
 0.525057 
 0.61201  
 0.432577 
 0.082207 
 0.199058 
 0.576082 
 0.218177 
 0.362036 
 0.204728 
 0.932984 
 0.827263 
 ⋮        
 0.0396356
 0.79041  
 0.431188 
 0.137658 
 0.60808  
 0.255054 
 0.498734 
 0.0940369
 0.52509  
 0.265511 
 0.110096 
 0.834362

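# Question 7: Change the names of the columns to Var1, Var2, and Var3
# Note: the third column is renamed to :Var here rather than :Var3, which is why the outputs show Var
rename!(df,Dict(:x1=>:Var1,:x2=>:Var2,:x3=>:Var))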

# The tail function shows the last n rows of a DataFrame (here, the last 20)
tail(df,20)

# Creating the second DataFrame from the last 20 rows of df
df2=DataFrame(tail(df,20))

# Question 9: Calculate simple descriptive statistics of all the columns in df2 using the describe() function
describe(df2)

Var1
Summary Stats:
Mean:           0.484341
Minimum:        0.010906
1st Quartile:   0.108001
Median:         0.510439
3rd Quartile:   0.826549
Maximum:        0.964670
Length:         20
Type:           Float64

Var2
Summary Stats:
Mean:           0.397753
Minimum:        0.036884
1st Quartile:   0.277730
Median:         0.368842
3rd Quartile:   0.601180
Maximum:        0.753508
Length:         20
Type:           Float64

Var
Summary Stats:
Mean:           0.453279
Minimum:        0.039636
1st Quartile:   0.136423
Median:         0.464961
3rd Quartile:   0.778998
Maximum:        0.932984
Length:         20
Type:           Float64

# Question 10: Add a column named Cat1 to df2, with each entry chosen at random from the strings GroupA and GroupB
df2 = hcat(df2, rand(["GroupA","GroupB"],20))
rename!(df2,Dict(:x1=>:Cat1))

# Question 11: Create a new DataFrame named df3
df3 = DataFrame(A=1:20,B=21:40,C=41:60)

# Question 12: Change indicated values to empty entries
# Set the following cells of df3 to NA: row 10, column 1; row 15, column 2; and row 19, column 3
df3[10,1] = NA
df3[15,2] = NA 
df3[19,3] = NA
df3

# Question 13: Create DataFrame df4 that contains no rows with NaN (NA) values
df4 = completecases!(df3)
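The idea behind dropping incomplete rows can be sketched without DataFrames at all. The block below assumes Julia 1.x, where the NA value used above was replaced by missing. It uses a plain matrix in place of df3, and the names table, keep and complete are chosen here for illustration only.

```julia
# Assumes Julia 1.x: `missing` plays the role that NA plays in the cells above.
A = collect(1:20)
B = collect(21:40)
C = collect(41:60)

# A 20 x 3 matrix whose element type allows missing values.
table = Matrix{Union{Missing,Int}}(hcat(A, B, C))
table[10, 1] = missing
table[15, 2] = missing
table[19, 3] = missing

# A row is a complete case when none of its entries is missing.
keep = [!any(ismissing, table[i, :]) for i in 1:size(table, 1)]
complete = table[keep, :]

println(size(complete))
```

Rows 10, 15 and 19 are filtered out, which matches the 17 rows left in df4.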

Title: Week 3 – Fitting a Curve

# Initialization of the Plots package with the PyPlot backend
using Plots
pyplot()

Plots.PyPlotBackend()

Reading data from the given sample file

data_tofit = readdlm("Week3_PR_Data.dat", '\t', header=true)
typeof(data_tofit)

Tuple{Array{Float64,2},Array{AbstractString,2}}

Using a for loop to print the data in the array

new_array=data_tofit[1]
for i in 1:size(new_array, 1)
    println(new_array[i,:])
end

[0.501309, -0.977698]
[1.52801, 0.527711]
[1.70012, 1.71152]
[1.99249, 1.891]
[2.70608, -0.463428]
[2.99493, -0.443567]
[3.49185, -1.27518]
[3.50119, -0.6905]
[4.45992, -5.51613]
[4.93697, -6.0017]
[5.02329, -8.36417]
[5.04234, -7.92448]
[5.50739, -10.7748]
[5.56867, -10.9172]

Scatter plot

# Create the arrays x and y, assigning x the first column of the data (new_array) and y the second column
x,y = new_array[:,1],new_array[:,2]
scatter(x,y)

Creating the parabfit() one-liner function

# Create a function called parabfit, with x as the argument, returning a*x^2 + b*x + c
parabfit(x)=a*x^2 + b*x + c

parabfit (generic function with 1 method)

Plotting against default values of a, b and c

a = 1
b = 1
c = 1

plot(parabfit,-2,2)

Plotting parabfit() over a different range

# Create variables a, b and c, assigning each the value 1
a = 1
b = 1
c = 1

# Plot the function parabfit, for x values between -5 and 5 
plot(parabfit,-5,5)

# First try with plot!(): overlay the parabola on the scatter plot of the data
a,b,c = 1,1,1
scatter(x,y)
plot!(parabfit,-5,5)

Optimize the parameters a, b and c so that the parabola fits the data points more closely.

The parabola should open downwards, so the coefficient a must be negative.
Judging from the data points, the value of the coefficient c should be close to zero.
The coefficient b sets the slope of the curve at x = 0; since the data rise at first, b must be positive.

# Another try with plot!(): a = -1, b = 2, c = 3
a,b,c = -1,2,3
scatter(x,y)
plot!(parabfit,-5,5)

# Another try with plot!(): a = -1, b = 0.1, c = 2
a,b,c = -1,0.1,2
scatter(x,y)
plot!(parabfit,-5,5)

# Another try with plot!(): a = -1, b = 0.8, c = 3
a,b,c = -1,0.8,3
scatter(x,y)
plot!(parabfit,-5,5)

# Another try with plot!(): a = -0.9, b = 2.7, c = 0.05
a,b,c = -0.9,2.7,0.05
scatter(x,y)
plot!(parabfit,-5,5)
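Comparing these tries by eye can be backed up with a number. As a rough sketch, the sum of squared residuals gives one score per try, where lower means a closer fit. The helper name sse and the variable names xs, ys and tries are made up here. The data points are copied from the rows printed earlier, so the block does not need the data file. Unlike parabfit, the helper takes a, b and c as arguments instead of reading global variables.

```julia
# Data points copied from the printed rows of Week3_PR_Data.dat above.
xs = [0.501309, 1.52801, 1.70012, 1.99249, 2.70608, 2.99493, 3.49185,
      3.50119, 4.45992, 4.93697, 5.02329, 5.04234, 5.50739, 5.56867]
ys = [-0.977698, 0.527711, 1.71152, 1.891, -0.463428, -0.443567, -1.27518,
      -0.6905, -5.51613, -6.0017, -8.36417, -7.92448, -10.7748, -10.9172]

# Hypothetical helper: sum of squared residuals of a*x^2 + b*x + c against the data.
sse(a, b, c) = sum(abs2, a .* xs .^ 2 .+ b .* xs .+ c .- ys)

# Score the hand-picked (a, b, c) tries from the cells above.
tries = [(1, 1, 1), (-1, 2, 3), (-1, 0.1, 2), (-1, 0.8, 3), (-0.9, 2.7, 0.05)]
for t in tries
    println(t, " => ", sse(t...))
end
```

The try with the smallest printed value is the closest of the five in the least-squares sense, which may or may not agree with the visual impression from the plots.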

Optimising each variable separately

Optimising variable c

a,b = 1,1
plot(scatter(x,y,alpha=0.5))
c=0
plot!(parabfit,-5,5)
c = -1
plot!(parabfit,-5,5)
c = -2
plot!(parabfit,-5,5)
c = -3
plot!(parabfit,-5,5)
c = -4
plot!(parabfit,-5,5)
c = -5
plot!(parabfit,-5,5)
c = 2
plot!(parabfit,-5,5)

Optimising variable a

c,b = 1,1
plot(scatter(x,y,alpha=0.5))
a=0
plot!(parabfit,0,5)
a = -1
plot!(parabfit,0,5)
a = -2
plot!(parabfit,0,5)
a = -3
plot!(parabfit,0,5)
a = -4
plot!(parabfit,0,5)
a = -5
plot!(parabfit,0,5)
a = 2
plot!(parabfit,0,5)

# Locating the final value for a
c,b = 3,1
plot(scatter(x,y,alpha=0.5))
a = -1
plot!(parabfit,0,5)

Optimising variable b

c,a = 2,-1
plot(scatter(x,y,alpha=0.5))
b=0
plot!(parabfit,0,5)
b = 1
plot!(parabfit,0,5)
b = 2
plot!(parabfit,0,5)
b = 3
plot!(parabfit,0,5)
b = 4
plot!(parabfit,0,5)
b = 5
plot!(parabfit,0,5)
b = -1
plot!(parabfit,0,5)

# Plotting for b = 3
c,a = 1,-1
plot(scatter(x,y,alpha=0.5))
b = 3
plot!(parabfit,0,8)

Final values of a, b and c

# Plotting the final values a = -1, b = 3, c = 1
c,a,b = 1,-1,3
plot(scatter(x,y,alpha=0.5))
plot!(parabfit,0,5)

To optimize the values of a, b and c, we had to plot the curve many times while varying one parameter at a time, to see how that
parameter changes the curve at different scales. Changing the plotting range of the parabola function made it easier to arrive at more accurate values of a, b and c.

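The manual tuning can be cross-checked against a least-squares fit. This is a different technique from the one the exercise uses, swapped in here only for comparison. The sketch assumes Julia 1.x and again copies the data points from the printed rows. The names xs, ys, design, sse and the *_fit variables are made up for illustration.

```julia
using LinearAlgebra  # Julia 1.x standard library; provides the least-squares backslash solve

# Data points copied from the printed rows of Week3_PR_Data.dat above.
xs = [0.501309, 1.52801, 1.70012, 1.99249, 2.70608, 2.99493, 3.49185,
      3.50119, 4.45992, 4.93697, 5.02329, 5.04234, 5.50739, 5.56867]
ys = [-0.977698, 0.527711, 1.71152, 1.891, -0.463428, -0.443567, -1.27518,
      -0.6905, -5.51613, -6.0017, -8.36417, -7.92448, -10.7748, -10.9172]

# Design matrix with columns x^2, x and 1, one row per data point.
design = hcat(xs .^ 2, xs, ones(length(xs)))

# For a tall matrix, backslash returns the least-squares solution.
a_fit, b_fit, c_fit = design \ ys

# Sum of squared residuals, used to compare the fitted and hand-tuned parabolas.
sse(a, b, c) = sum(abs2, a .* xs .^ 2 .+ b .* xs .+ c .- ys)

println("fitted: a=", a_fit, " b=", b_fit, " c=", c_fit)
println("SSE fitted: ", sse(a_fit, b_fit, c_fit))
println("SSE hand-tuned (-1, 3, 1): ", sse(-1, 3, 1))
```

By construction the fitted parabola cannot score worse than the hand-tuned one on this measure, so the gap between the two printed values shows how close the manual search got.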
Output of df = DataFrame(array_1):
	x1	x2	x3
1	0.5908446386657102	0.9311151512445586	0.6437042811826996
2	0.7667970365022592	0.43893895933102156	0.40142056533714965
3	0.5662374165061859	0.24686248047491066	0.5250572942486489
4	0.4600853424625171	0.011819583479107054	0.6120098074984683
5	0.7940257103317943	0.046042826396498704	0.43257652982765626
6	0.8541465903790502	0.496168672722459	0.0822070287962946
7	0.20058603493384108	0.7320003814997245	0.19905799020907944
8	0.2986142783434118	0.29905752670238184	0.5760819730593403
9	0.24683718661000897	0.4491821088563024	0.21817706596841413
10	0.5796722333690416	0.8750962647851142	0.3620355262053865
11	0.6488819502093455	0.046288741031345504	0.20472832290217324
12	0.010905889635595356	0.6983555060532487	0.93298350850828
13	0.06642303695533736	0.3651093677271471	0.8272627957034728
14	0.9567533636029237	0.3024777928234499	0.09929915955881308
15	0.646690981531646	0.3725754415996787	0.6342997886044144
16	0.11248587118714015	0.15050782744925795	0.1327153585755645
17	0.2760209506672211	0.14732938279328955	0.7751941503856596
18	0.6516642063795697	0.2834013103457036	0.8692366891234362
19	0.05664246860321187	0.40495283364883794	0.039635617270926904
20	0.8427136165865521	0.49953074411487797	0.7904095314876494
21	0.9504984071553011	0.6588147837334961	0.43118828904466633
22	0.9646697763820897	0.5156272179795256	0.1376583132625555
23	0.9457754052519123	0.26071522632820776	0.6080803126880718
24	0.7899036826169576	0.5955204840509289	0.2550540600167448
25	0.8211604203482923	0.2924615242315285	0.4987340031883092
26	0.03416010848943718	0.2885798506061561	0.09403688346569439
27	0.09454448946400307	0.6181597973815087	0.5250899072103514
28	0.31492622391998415	0.6642598175011505	0.2655109248498748
29	0.12780989889368866	0.7535081177709988	0.11009621399607639
30	0.374186714831074	0.03688418241886171	0.8343616661080064

df after rename!:
	Var1	Var2	Var
1	0.5908446386657102	0.9311151512445586	0.6437042811826996
2	0.7667970365022592	0.43893895933102156	0.40142056533714965
3	0.5662374165061859	0.24686248047491066	0.5250572942486489
4	0.4600853424625171	0.011819583479107054	0.6120098074984683
5	0.7940257103317943	0.046042826396498704	0.43257652982765626
6	0.8541465903790502	0.496168672722459	0.0822070287962946
7	0.20058603493384108	0.7320003814997245	0.19905799020907944
8	0.2986142783434118	0.29905752670238184	0.5760819730593403
9	0.24683718661000897	0.4491821088563024	0.21817706596841413
10	0.5796722333690416	0.8750962647851142	0.3620355262053865
11	0.6488819502093455	0.046288741031345504	0.20472832290217324
12	0.010905889635595356	0.6983555060532487	0.93298350850828
13	0.06642303695533736	0.3651093677271471	0.8272627957034728
14	0.9567533636029237	0.3024777928234499	0.09929915955881308
15	0.646690981531646	0.3725754415996787	0.6342997886044144
16	0.11248587118714015	0.15050782744925795	0.1327153585755645
17	0.2760209506672211	0.14732938279328955	0.7751941503856596
18	0.6516642063795697	0.2834013103457036	0.8692366891234362
19	0.05664246860321187	0.40495283364883794	0.039635617270926904
20	0.8427136165865521	0.49953074411487797	0.7904095314876494
21	0.9504984071553011	0.6588147837334961	0.43118828904466633
22	0.9646697763820897	0.5156272179795256	0.1376583132625555
23	0.9457754052519123	0.26071522632820776	0.6080803126880718
24	0.7899036826169576	0.5955204840509289	0.2550540600167448
25	0.8211604203482923	0.2924615242315285	0.4987340031883092
26	0.03416010848943718	0.2885798506061561	0.09403688346569439
27	0.09454448946400307	0.6181597973815087	0.5250899072103514
28	0.31492622391998415	0.6642598175011505	0.2655109248498748
29	0.12780989889368866	0.7535081177709988	0.11009621399607639
30	0.374186714831074	0.03688418241886171	0.8343616661080064

Output of tail(df,20), which is also the content of df2:
	Var1	Var2	Var
1	0.6488819502093455	0.046288741031345504	0.20472832290217324
2	0.010905889635595356	0.6983555060532487	0.93298350850828
3	0.06642303695533736	0.3651093677271471	0.8272627957034728
4	0.9567533636029237	0.3024777928234499	0.09929915955881308
5	0.646690981531646	0.3725754415996787	0.6342997886044144
6	0.11248587118714015	0.15050782744925795	0.1327153585755645
7	0.2760209506672211	0.14732938279328955	0.7751941503856596
8	0.6516642063795697	0.2834013103457036	0.8692366891234362
9	0.05664246860321187	0.40495283364883794	0.039635617270926904
10	0.8427136165865521	0.49953074411487797	0.7904095314876494
11	0.9504984071553011	0.6588147837334961	0.43118828904466633
12	0.9646697763820897	0.5156272179795256	0.1376583132625555
13	0.9457754052519123	0.26071522632820776	0.6080803126880718
14	0.7899036826169576	0.5955204840509289	0.2550540600167448
15	0.8211604203482923	0.2924615242315285	0.4987340031883092
16	0.03416010848943718	0.2885798506061561	0.09403688346569439
17	0.09454448946400307	0.6181597973815087	0.5250899072103514
18	0.31492622391998415	0.6642598175011505	0.2655109248498748
19	0.12780989889368866	0.7535081177709988	0.11009621399607639
20	0.374186714831074	0.03688418241886171	0.8343616661080064


df2 after adding the Cat1 column:
	Var1	Var2	Var	Cat1
1	0.6488819502093455	0.046288741031345504	0.20472832290217324	GroupB
2	0.010905889635595356	0.6983555060532487	0.93298350850828	GroupB
3	0.06642303695533736	0.3651093677271471	0.8272627957034728	GroupA
4	0.9567533636029237	0.3024777928234499	0.09929915955881308	GroupA
5	0.646690981531646	0.3725754415996787	0.6342997886044144	GroupA
6	0.11248587118714015	0.15050782744925795	0.1327153585755645	GroupA
7	0.2760209506672211	0.14732938279328955	0.7751941503856596	GroupB
8	0.6516642063795697	0.2834013103457036	0.8692366891234362	GroupB
9	0.05664246860321187	0.40495283364883794	0.039635617270926904	GroupB
10	0.8427136165865521	0.49953074411487797	0.7904095314876494	GroupB
11	0.9504984071553011	0.6588147837334961	0.43118828904466633	GroupA
12	0.9646697763820897	0.5156272179795256	0.1376583132625555	GroupB
13	0.9457754052519123	0.26071522632820776	0.6080803126880718	GroupA
14	0.7899036826169576	0.5955204840509289	0.2550540600167448	GroupB
15	0.8211604203482923	0.2924615242315285	0.4987340031883092	GroupA
16	0.03416010848943718	0.2885798506061561	0.09403688346569439	GroupB
17	0.09454448946400307	0.6181597973815087	0.5250899072103514	GroupB
18	0.31492622391998415	0.6642598175011505	0.2655109248498748	GroupA
19	0.12780989889368866	0.7535081177709988	0.11009621399607639	GroupA
20	0.374186714831074	0.03688418241886171	0.8343616661080064	GroupA


Output of df3 = DataFrame(A=1:20,B=21:40,C=41:60):
	A	B	C
1	1	21	41
2	2	22	42
3	3	23	43
4	4	24	44
5	5	25	45
6	6	26	46
7	7	27	47
8	8	28	48
9	9	29	49
10	10	30	50
11	11	31	51
12	12	32	52
13	13	33	53
14	14	34	54
15	15	35	55
16	16	36	56
17	17	37	57
18	18	38	58
19	19	39	59
20	20	40	60

df3 after setting the three cells to NA:
	A	B	C
1	1	21	41
2	2	22	42
3	3	23	43
4	4	24	44
5	5	25	45
6	6	26	46
7	7	27	47
8	8	28	48
9	9	29	49
10	NA	30	50
11	11	31	51
12	12	32	52
13	13	33	53
14	14	34	54
15	15	NA	55
16	16	36	56
17	17	37	57
18	18	38	58
19	19	39	NA
20	20	40	60

Output of df4 = completecases!(df3):
	A	B	C
1	1	21	41
2	2	22	42
3	3	23	43
4	4	24	44
5	5	25	45
6	6	26	46
7	7	27	47
8	8	28	48
9	9	29	49
10	11	31	51
11	12	32	52
12	13	33	53
13	14	34	54
14	16	36	56
15	17	37	57
16	18	38	58
17	20	40	60
