
R programming language introduction


 

R is a programming language and open-source software environment widely used for statistical computing, data analysis, and graphics. It was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. It was first announced publicly in 1993, became free software under the GPL in 1995, and reached version 1.0.0 in 2000. R provides a comprehensive set of tools for manipulating, visualizing, and modelling data, making it a favourite among statisticians, data scientists, researchers, and analysts.

Key Features of R:

1. Data Manipulation: R offers powerful data manipulation capabilities, allowing you to clean, transform, and preprocess data easily. Packages like `dplyr` and `tidyr` provide functions for efficient data wrangling.

2. Statistical Analysis: R provides an extensive range of statistical functions and libraries for performing various types of analyses, including regression, hypothesis testing, ANOVA, and more.

3. Visualization: R is known for its exceptional visualization capabilities. The `ggplot2` package is widely used for creating high-quality, customizable graphs and plots.

4. Machine Learning: R has a growing ecosystem of machine learning libraries, such as `caret`, `randomForest`, and `xgboost`, enabling you to build and train predictive models.

5. Packages and Libraries: R's strength lies in its vast collection of packages and libraries contributed by the R community. These packages cover a wide range of domains, from bioinformatics to finance.

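6. Data Import/Export: R can handle various data formats, including CSV, Excel, JSON, and databases. Base functions such as `read.csv()` and `write.csv()` cover delimited text files, while packages such as `readxl`, `jsonlite`, and `DBI` handle Excel, JSON, and database connections.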

7. Reproducibility: R promotes reproducible research by allowing users to create scripts that document and automate their data analysis and visualization processes.

8. Interactive Environment: R provides an interactive environment through the R Console or integrated development environments (IDEs) like RStudio, making it user-friendly and efficient for data exploration.
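
To make the first three features concrete, here is a minimal sketch that uses only base R, so nothing needs to be installed. The built-in `mtcars` dataset and the names `four_cyl`, `kpl`, and `fit` are assumptions chosen for illustration; packages such as `dplyr` and `ggplot2` offer richer alternatives for the same steps.

```r
# Built-in dataset: fuel economy figures for 32 cars (ships with base R).
data(mtcars)

# 1. Data manipulation: keep the 4-cylinder cars and add a derived column.
four_cyl <- subset(mtcars, cyl == 4)
four_cyl$kpl <- four_cyl$mpg * 0.4251  # miles per gallon to km per litre

# 2. Statistical analysis: simple linear regression of mpg on weight.
fit <- lm(mpg ~ wt, data = mtcars)
slope <- coef(fit)[["wt"]]
print(slope)  # negative: heavier cars travel fewer miles per gallon

# 3. Visualization: scatter plot with the fitted regression line.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
```

Running `summary(fit)` afterwards prints the coefficient table and R-squared for the same model.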

Where and When to Use R:

1. Data Analysis and Exploration: R is ideal for exploring and analyzing data to gain insights, identify trends, and understand underlying patterns.

2. Statistical Modeling: When you need to perform complex statistical analyses, hypothesis testing, and create statistical models, R is a suitable choice.

3. Data Visualization: If you want to create publication-quality graphs and visualizations, R's `ggplot2` package allows you to customize and control every aspect of your visualizations.

4. Academic and Research Projects: R is extensively used in academia and research for conducting experiments, analyzing data, and presenting findings.

5. Machine Learning and Predictive Modeling: R's machine learning libraries enable you to build predictive models for classification, regression, clustering, and more.

6. Biostatistics and Healthcare: R is popular in the medical and healthcare fields for analyzing clinical trial data, epidemiology, and bioinformatics.

7. Financial Analysis: R is used for quantitative analysis, risk management, and portfolio optimization in the finance industry.

8. Data Science: R plays a significant role in data science projects, where it's used for data preprocessing, feature engineering, modeling, and visualization.

In summary, R is a versatile and powerful tool for data analysis, statistical computing, and visualization. Its wide range of packages and strong statistical capabilities make it a popular choice for individuals and organizations working with data.

To install R and RStudio on your computer, follow these steps:


Installing R:

1. Windows:

   - Visit the CRAN (Comprehensive R Archive Network) website for Windows: https://cran.r-project.org/bin/windows/base/

   - Click on the "Download R for Windows" link.

   - Choose a CRAN mirror location (usually the first option).

   - Download the executable installer file (e.g., R-4.1.1-win.exe).

   - Run the installer and follow the installation prompts.

2. macOS:

   - Visit the CRAN website for macOS: https://cran.r-project.org/bin/macosx/

   - Download the latest version of R for macOS.

   - Open the downloaded installer package (`.pkg` file).

   - Follow the installation prompts.

3. Linux:

   - Depending on your Linux distribution, you can usually install R using your package manager. For example, on Ubuntu, you can open the terminal and run:

     ```
     sudo apt-get update
     sudo apt-get install r-base
     ```

Installing RStudio:

1. Windows, macOS, Linux:

   - Visit the RStudio download page: https://www.rstudio.com/products/rstudio/download/

   - Scroll down to the "Installers for Supported Platforms" section.

   - Choose the appropriate installer for your operating system (RStudio Desktop Open Source Edition).

   - Download and run the installer.

   - Follow the installation prompts.

Running R and RStudio:

1. R:

   - After installing R, you can run it by opening the R console. On Windows, this is typically called "R GUI" or "Rterm" in the Start Menu. On macOS and Linux, you can open the terminal and type `R` to start the R console.

2. RStudio:

   - After installing RStudio, you can run it by searching for "RStudio" in your application launcher (Windows) or using Spotlight search (macOS). On Linux, you can use the terminal to launch RStudio by typing `rstudio`.

Once you have both R and RStudio installed, you can start using R for data analysis, statistical modeling, and more. RStudio provides a user-friendly interface and enhanced features for working with R scripts, projects, and visualizations.

Remember to periodically check for updates to both R and RStudio to ensure you're using the latest versions with the most up-to-date features and bug fixes.
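
To check which version of R you are currently running, something like the following can be typed at the R console. This is a small sketch using base functions only; the `"4.0.0"` threshold is an arbitrary value picked for illustration.

```r
# The running R version as a single human-readable string.
r_version <- R.version.string
print(r_version)

# The version as a comparable object, handy for minimum-version checks.
current <- getRversion()
print(current >= "4.0.0")

# Platform details and the packages attached in this session.
print(sessionInfo())
```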


Some example code:

print("R")

x <- 5
y <- 10
total <- x + y
plot(1:total)  # scatter plot of the integers 1 to 15

var1 <- "machine"
var2 <- "learning"
cat(var1, var2, "\n")  # cat() already separates its arguments with a space


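# data types
num <- 20      # numeric (double)
intnum <- 20L  # integer: the L suffix forces integer storage
print(num)
class(num)     # "numeric"
print(intnum)
class(intnum)  # "integer"

logic <- TRUE
class(logic)   # "logical"

char <- "a"
class(char)    # "character"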


# converting data types
a <- 25L
class(a)                    # "integer"

num1 <- as.numeric(a)
class(num1)                 # "numeric"
print(num1)

b <- "25"
num2 <- as.numeric(b)       # character to numeric
class(num2)
print(num2)

num3 <- as.numeric(TRUE)    # 1
num3
num4 <- as.numeric(FALSE)   # 0
num4

num5 <- as.integer(45.564)  # 45: the fractional part is dropped
num5

num6 <- as.integer("a")     # NA, with a coercion warning
num6

log1 <- as.logical(56L)     # TRUE: any non-zero number is TRUE
log1

com1 <- as.complex(234.09)  # 234.09+0i
com1

chr1 <- as.character(345)   # "345"
chr1


x  # still 5 from the first example

if (x %% 2 == 0) {
  print("The number is even")
} else {
  print("The number is odd")
}


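# operators
a <- 5
b <- 10
print(a + b)    # addition
print(a - b)    # subtraction
print(a * b)    # multiplication
print(b / a)    # division
print(b %% a)   # modulo (remainder)
print(b %/% a)  # integer division

c1 <- c(10, 20, 30)
c1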


x <- 2
while (x < 6) {
  print(x)
  x <- x + 1
}

# repeat runs until break is reached; x is 6 when this loop starts
repeat {
  print(x)
  x <- x + 1
  if (x > 10) {
    break
  }
}


# next and break
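
A minimal sketch of `next` and `break` inside a `for` loop. The vector name `odds` and the cut-off of 7 are hypothetical, picked only to show both keywords in one loop.

```r
# Collect the odd numbers up to 7.
odds <- c()
for (i in 1:10) {
  if (i %% 2 == 0) {
    next   # skip even numbers and continue with the next i
  }
  if (i > 7) {
    break  # leave the loop entirely once i exceeds 7
  }
  odds <- c(odds, i)
}
print(odds)  # 1 3 5 7
```

`next` only abandons the current iteration, while `break` ends the whole loop, which is why 9 never reaches `odds`.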





# reading from user input
age <- readline("What is your age? ")  # type 49 at the prompt
age                                    # readline() returns a character string
nam <- readline("What is your name? ")
nam

print(paste("Hello, my name is", nam, "and my age is", age))
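
Because `readline()` always returns text, the value usually needs converting before any arithmetic. The sketch below assumes a hypothetical helper named `parse_age`, and falls back to a stand-in value when R is not running interactively (for example under `Rscript`, where `readline()` returns an empty string).

```r
# Hypothetical helper: convert raw text from readline() into an integer.
parse_age <- function(raw) {
  age <- suppressWarnings(as.integer(raw))
  if (is.na(age)) {
    stop("age must be a number")
  }
  age
}

if (interactive()) {
  age <- parse_age(readline("What is your age? "))
} else {
  age <- parse_age("49")  # stand-in value when no console is attached
}
print(age + 1)
```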


# function
new_func <- function() {
  for (i in 1:5) {
    print(i + 2)
  }
}

new_func()

func2 <- function(x, y) {
  res <- x * y
  print(paste("x:", x, "y:", y))
  print(paste("res:", res))
}

func2(5, 8)
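
The functions above only print. A function can also return a value and declare default arguments, as in this small sketch; `rect_area` is a hypothetical name used for illustration.

```r
# Hypothetical function: area of a rectangle, defaulting to a square.
rect_area <- function(width, height = width) {
  width * height  # the last evaluated expression is the return value
}

rect_area(5, 8)  # 40
rect_area(4)     # 16, because height defaults to width

# Arithmetic in R is vectorized, so vectors work too.
areas <- rect_area(1:3, 2)
print(areas)     # 2 4 6
```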


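# substring
a <- "dhiraj patra"
res <- substr(a, 3, 7)  # start and stop positions are both inclusive
print(toupper(res))     # "IRAJ "

# regular expression
s2 <- c("abc", "abcde", "abcdef")
pat <- "^abc"           # ^ anchors the match at the start of the string
print(grep(pat, s2))    # indices of the matching elements: 1 2 3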
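
`grep()` has a few close relatives in base R that are worth knowing. The sketch below uses a slightly different example vector (an assumption, so that one element does not match) to show how they differ.

```r
s2 <- c("abc", "abcde", "xabc")

idx <- grep("^abc", s2)                # indices of matches: 1 2
hits <- grep("^abc", s2, value = TRUE) # matching strings: "abc" "abcde"
flags <- grepl("^abc", s2)             # one logical per element: TRUE TRUE FALSE
replaced <- sub("^abc", "ABC", s2)     # replace the first match in each string

print(idx)
print(hits)
print(flags)
print(replaced)                        # "ABC" "ABCde" "xabc"
```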


# list objects in memory
ls()
# with pattern
ls(pattern = "n")  # object names containing "n"
ls.str()           # names plus a short structure summary
n <- 0.5
n1 <- 10
n2 <- 100
nam <- "Carmen"

# dataframe
M <- data.frame(n1, n2, n)
ls.str(pattern = "M")


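getwd()
# Example path from the author's machine; replace it with a directory on your system.
setwd("/Users/Admin/Desktop/personal/courses/EICT_academy_IIT_Guhati/R")
my_data <- read.csv("sales_Data.csv")  # the file must exist in the working directory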
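
Since `sales_Data.csv` is not included here, the following self-contained sketch writes a small, made-up table to a temporary file and reads it back. The data frame `sales` and its columns are hypothetical stand-ins, not the contents of the real file.

```r
# Hypothetical sales table standing in for a real CSV file.
sales <- data.frame(
  region = c("north", "south", "east"),
  units  = c(120L, 85L, 97L)
)

# Write to a temporary file, then read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(sales, csv_path, row.names = FALSE)
my_data <- read.csv(csv_path)

str(my_data)      # column names and types
head(my_data, 2)  # first two rows
total_units <- sum(my_data$units)
print(total_units)  # 302
```

Using `tempfile()` avoids changing the working directory, so the example behaves the same on any machine.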


x <- 1:30
print(x[1:30])

# sequence
y <- seq(1, 5, 0.5)  # from 1 to 5 in steps of 0.5
y

# take the data from keyboard
z <- scan()  # enter numbers, then a blank line to finish
z


# Gaussian sequence
g <- rnorm(10, mean = 0, sd = 1)  # 10 draws from the standard normal distribution
g


# factors
factor(1:3)
factor(1:10, exclude = 5)  # 5 becomes NA and is dropped from the levels

# matrix
matrix(1:6, 2, 3)                # filled column by column
matrix(1:6, 2, 3, byrow = TRUE)  # filled row by row


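# dimension
x <- 1:15
x
dim(x)             # NULL: a plain vector has no dimensions
dim(x) <- c(5, 3)  # reshape into a 5 x 3 matrix
x

x <- 1:4
n <- 10
M <- c(10, 35)
y <- 2:4
data.frame(x, n)  # n is recycled to 4 rows
data.frame(x, M)  # M is recycled twice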
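
`data.frame()` recycles a shorter vector only when the longer length is a whole multiple of the shorter one. The sketch below reuses `x`, `M`, and `y` from above; `df` and `result` are names introduced for illustration.

```r
x <- 1:4
M <- c(10, 35)
y <- 2:4

df <- data.frame(x, M)  # M (length 2) is recycled to length 4
print(df$M)             # 10 35 10 35

# Lengths 4 and 3 are incompatible, so data.frame() raises an error.
result <- tryCatch(data.frame(x, y), error = function(e) "error")
print(result)           # "error"
```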


# list
L1 <- list(x, y)
L2 <- list(A = x, B = y)  # named elements
L1
L2

# expression
x <- 3
y <- 2.5
z <- 1
exp1 <- expression(x / (y + exp(z)))
exp1
eval(expr = exp1)  # evaluated with the current values of x, y, and z


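# factor
fac2 <- factor(c("Male", "Female"))
fac2
as.numeric(fac2)  # 2 1: the internal level codes (levels sort alphabetically)

fac <- factor(c(1, 10))
fac
as.numeric(fac)                # 1 2: level codes, not the original values
as.numeric(as.character(fac))  # 1 10: the original values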


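# logical comparison
x <- 0.5
0 < x & x < 1  # comparisons cannot be chained in R, so combine them with &
x <- 1:3
y <- 1:3
x == y         # element-wise comparison
x <- 1:10
x[x >= 5] <- 20  # replace every element that is at least 5
x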
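
Comparisons in R are element-wise, so a range check on a vector yields one logical value per element. A short sketch, with the values in `x` made up for illustration:

```r
x <- c(0.2, 0.5, 1.4)

inside <- x > 0 & x < 1  # element-wise AND
print(inside)            # TRUE TRUE FALSE

print(any(inside))       # TRUE: at least one element is in range
print(all(inside))       # FALSE: 1.4 is out of range
print(which(inside))     # 1 2: positions of the TRUE values
print(x[inside])         # 0.2 0.5: logical indexing
```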


# object
names(x) <- letters[1:10]  # one name per element
x

# predefined functions
sum(x)
prod(x)
max(x)
min(x)
mean(x)
median(x)


# graphics
layout(matrix(1:4, 2, 2))  # split the device into a 2 x 2 grid
mat <- matrix(1:4, 2, 2)
mat
layout(mat)
layout.show(4)             # outline the 4 panels

m <- matrix(c(1:3, 3), 2, 2)  # panel 3 spans the whole second column
layout(m)
layout.show(3)

plot(x)
boxplot(x)
pie(x)
hist(x)
barplot(x)

x <- rnorm(10)
y <- rnorm(10)
plot(x, y)
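
Plots can also be written to a file instead of the screen. The sketch below assumes a temporary PDF path (`out`) and a fixed random seed so the figure is repeatable; it draws four panels with the same `layout()` call used above.

```r
out <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out, width = 7, height = 7)    # open a PDF graphics device

layout(matrix(1:4, 2, 2))          # 2 x 2 grid of panels
set.seed(1)                        # make the random draws repeatable
x <- rnorm(100)
plot(x, main = "Index plot")
hist(x, main = "Histogram")
boxplot(x, main = "Box plot")
qqnorm(x)                          # normal quantile-quantile plot

invisible(dev.off())               # close the device so the file is written
print(file.exists(out))
```

The file is only complete after `dev.off()` is called, so forgetting it is a common reason for empty or unreadable output.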


# loop
for (i in seq_along(x)) {
  y[i] <- i * 2
}
y

i <- 1  # R vectors are indexed from 1
while (i <= 10) {
  print(y[i])
  i <- i + 1  # assign the result, otherwise the loop never ends
}
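
In R, a loop that fills a vector element by element can often be replaced by a single vectorized expression. A sketch comparing the two approaches; `y_loop` and `y_vec` are names introduced here for illustration.

```r
x <- rnorm(10)

# Loop version: pre-allocate the result, then fill it in.
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- i * 2
}

# Vectorized version: one expression, no explicit loop.
y_vec <- seq_along(x) * 2

print(y_vec)
print(identical(y_loop, y_vec))  # TRUE

# seq_along() is safer than 1:length(x) when a vector is empty.
empty <- numeric(0)
print(1:length(empty))    # 1 0, so a loop would run twice by mistake
print(seq_along(empty))   # integer(0), so a loop would not run at all
```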












