Cyber Analytics and Intelligence: Decision Trees

  

DECISION TREES for Risk Assessment

One of the great advantages of decision trees is their interpretability. The rules learnt for classification are easy for a person to follow, unlike the opaque “black box” of many other methods, such as neural networks. We demonstrate this using a German credit data set. You can read a description of this data set at the UCI site. The task is to predict whether a loan applicant is a good or bad credit risk based on 20 attributes. We’ve simplified the data set somewhat, in particular by making attribute names and values more meaningful.

1. Download the credit_Dataset.arff dataset and load it into Weka.

2. (5 Points) When presented with a dataset, it is usually a good idea to visualise it first. Go to the Visualize tab. Click on any of the scatter plots to open a new window that shows the scatter plot for the two selected attributes. Try visualising a scatter plot of age and duration. Do you notice anything unusual? You can click on any data point to display all its values.

3. (5 Points) In the previous step you should have found a data point that seems to be corrupted, as some of its values are nonsensical. Even a single point like this can significantly affect the performance of a classifier. How do you think it would affect decision trees? A good way to check this is to test the performance of each classifier before and after removing this data point.

4. (10 Points) To remove this instance from the dataset we will use a filter. We want to remove all instances where the age of the applicant is lower than 0 years, as this suggests that the instance is corrupted. In the Preprocess tab, click on Choose in the Filter pane. Select filters > unsupervised > instance > RemoveWithValues. Click on the text of this filter to change its parameters. Set the attribute index to 13 (Age) and set the split point to 0. Click OK to set the parameters and Apply to apply the filter to the data. Visualise the data again to verify that the invalid data point was removed.
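The effect of the filter in step 4 can also be sketched outside Weka. The Python snippet below is a minimal illustration, not Weka's implementation: the function name `remove_below`, the record layout and the sample values are all hypothetical. It simply drops every record whose chosen numeric attribute falls below a split point.

```python
from typing import Dict, List

Record = Dict[str, float]


def remove_below(records: List[Record], attribute: str, split_point: float) -> List[Record]:
    """Keep only the records whose `attribute` value is >= split_point."""
    return [r for r in records if r[attribute] >= split_point]


# Made-up sample records (not taken from the credit dataset).
applicants = [
    {"age": 35, "duration": 12},
    {"age": -1, "duration": 24},  # implausible age, treated as corrupted
    {"age": 52, "duration": 48},
]

cleaned = remove_below(applicants, "age", 0)
print(len(applicants), "records before,", len(cleaned), "after")
```

Counting the records before and after the filter is a quick check that only the corrupted instance was removed.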

5. (20 Points) On the Classify tab, select the Percentage split test option and change its value to 90%. This way, we will train the classifiers using 90% of the data and evaluate their performance on the remaining 10%. First, train a decision tree classifier with default options. Select classifiers > trees > J48 and click Start. J48 is the Weka implementation of the C4.5 algorithm, which uses the normalized information gain criterion (the gain ratio) to build a decision tree for classification.
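The normalized information gain mentioned in step 5 is usually called the gain ratio: the information gain of a split divided by the entropy of the split itself. The sketch below computes it for a nominal attribute only. The function names and the toy values are assumptions made for illustration, and the real C4.5 algorithm adds further handling (numeric thresholds, missing values, extra selection heuristics) that is not shown here.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence


def entropy(items: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the value distribution in `items`."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def gain_ratio(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Information gain of splitting on `values`, divided by the split entropy."""
    total = len(labels)
    partitions: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data: the attribute separates the two classes perfectly.
checking = ["<0", "<0", ">=0", ">=0"]
risk = ["bad", "bad", "good", "good"]
print(gain_ratio(checking, risk))
```

Dividing by the split entropy penalises attributes with many distinct values, which would otherwise look attractive on raw information gain alone.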

6. (20 Points) After training the classifier, the full decision tree is output for your perusal; you may need to scroll up to see it. The tree may also be viewed in graphical form by right-clicking in the Result list and selecting Visualize tree; unfortunately, this format is very cluttered for large trees. Such a tree accentuates one of the strengths of decision tree algorithms: they produce classifiers which are understandable to humans. This can be an important asset in real-life applications (people are seldom prepared to do what a computer program tells them if there is no clear explanation). Observe the output of the classifier and try to answer the following questions:

o How would you assess the performance of the classifier? Is the percentage of Correctly Classified Instances a sufficient measure in this case? Why? Hint: check the number of good and bad cases in the test sample using the confusion matrix. Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class. For example, consider an experiment with P positive instances and N negative instances. The four possible outcomes (true positive, false negative, false positive and true negative) can be laid out in a 2-by-2 contingency table, or confusion matrix. One benefit of a confusion matrix is that it makes it easy to see whether the system is confusing two classes (i.e. commonly mislabelling one as another).

o Looking at the decision tree itself, are the rules it applies sensible? Are there any branches which appear absurd? At what depth of the tree? What does this suggest?
Hint: Check the rules applied after following the paths: (a) CheckingAccount = <0, Foreign = yes, Duration > 11, Job = skilled, OtherDebtors = none, Duration <= 30 and (b) CheckingAccount = <0, Foreign = yes, Duration > 11, Job = unskilled.

o How does the decision tree deal with classification in the case where there are zero instances in the training set corresponding to that particular path in the tree (e.g. those leaf nodes that have (0:0))?
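The confusion matrix described in the first question above can be computed by hand from actual and predicted labels. The sketch below uses invented labels, not output from Weka, and the function names are hypothetical. It treats "bad" as the positive class and shows how a classifier can reach a seemingly reasonable accuracy while never identifying a single bad case.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(actual: Sequence[str], predicted: Sequence[str],
                     positive: str) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn) with `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def summarize(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, recall and precision derived from the four counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Invented test sample: 7 good and 3 bad cases, classifier always says "good".
actual = ["good"] * 7 + ["bad"] * 3
predicted = ["good"] * 10

counts = confusion_counts(actual, predicted, positive="bad")
print(counts, summarize(*counts))
```

With an unbalanced test sample like this one, accuracy alone hides the fact that every bad case was missed, which is why the per-class counts are worth reading.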

7. (20 Points) Now, explore the effect of the confidenceFactor option. You can find this by clicking on the classifier name (to the right of the Choose button on the Classify tab). In the classifier options window, click on the More button to find out what the confidence factor controls. Try the values 0.1, 0.2, 0.3 and 0.5. What is the performance of the classifier in each case? Did you expect this given your observations in the previous questions? Why do you think this happens?
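One way to build intuition for the confidence factor in step 7 is the pessimistic error estimate used when pruning C4.5-style trees. The sketch below uses the normal-approximation upper bound found in textbook presentations of C4.5. It is an assumption that this is close enough for illustration; Weka's J48 computes its estimate with its own routine, so the numbers may not match. The function name and the leaf counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pessimistic_error(errors: int, n: int, confidence_factor: float) -> float:
    """Upper confidence bound on a leaf's error rate (normal approximation).

    `errors` of the `n` training instances at the leaf are misclassified;
    `confidence_factor` must lie strictly between 0 and 1.
    """
    f = errors / n
    z = NormalDist().inv_cdf(1.0 - confidence_factor)
    numerator = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)


# Hypothetical leaf: 2 errors among 6 training instances.
for cf in (0.1, 0.2, 0.3, 0.5):
    print(cf, round(pessimistic_error(2, 6, cf), 3))
```

In this sketch a smaller confidence factor yields a larger (more pessimistic) error estimate for the same leaf, so subtrees are more likely to be replaced by a leaf and the tree is pruned harder.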

8. (20 Points) Suppose that it is worse to classify a customer as good when they are bad than it is to classify a customer as bad when they are good. Which value would you pick for the confidence factor? Which performance measure would you base your decision on?
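When the two kinds of error in step 8 are not equally bad, one option is to weight them by a cost and compare classifiers on total cost rather than accuracy. The sketch below is illustrative only: the cost values and the error counts are invented, and `misclassification_cost` is a hypothetical helper, not a Weka function.

```python
def misclassification_cost(bad_as_good: int, good_as_bad: int,
                           cost_bad_as_good: float, cost_good_as_bad: float) -> float:
    """Total cost of the errors, weighting each error type by its own cost."""
    return bad_as_good * cost_bad_as_good + good_as_bad * cost_good_as_bad


# Assumed costs: accepting a bad customer is five times worse than
# rejecting a good one. The ratio is an illustration, not a given.
COST_BAD_AS_GOOD = 5.0
COST_GOOD_AS_BAD = 1.0

# Two invented classifiers with the same number of errors (same accuracy).
cost_a = misclassification_cost(3, 1, COST_BAD_AS_GOOD, COST_GOOD_AS_BAD)
cost_b = misclassification_cost(1, 3, COST_BAD_AS_GOOD, COST_GOOD_AS_BAD)
print("classifier A:", cost_a, "classifier B:", cost_b)
```

Both classifiers make four errors, yet their costs differ, so a measure that separates the two error types is more informative here than the overall percentage correct.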

9. (20 Points) Finally, we will create a random decision forest and compare the performance of this classifier to that of the decision tree and the decision stump. The random decision forest is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes output by the individual trees. Again, set the test option Percentage split to 90%. Select classifiers > trees > RandomForest and hit Start. Again, observe the output. How high can you get the performance of the classifier by changing the number of trees (numTrees) parameter? How does the random decision forest compare performance-wise to the decision tree and decision stump?
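The voting rule described in step 9, where the forest outputs the most common class among its trees, can be sketched in a few lines. The three "trees" below are hand-written stand-ins with made-up rules, not trees learnt from the credit data, and the function names are hypothetical. A real implementation may also combine class probabilities rather than hard votes.

```python
from collections import Counter
from typing import Callable, Dict, List

Record = Dict[str, float]
Tree = Callable[[Record], str]


def majority_vote(votes: List[str]) -> str:
    """Return the most frequent class label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def forest_predict(trees: List[Tree], record: Record) -> str:
    """Ask every tree for a prediction and return the mode."""
    return majority_vote([tree(record) for tree in trees])


# Hypothetical stand-ins for individual trees (rules invented for the example).
def tree_a(r: Record) -> str:
    return "bad" if r["duration"] > 30 else "good"


def tree_b(r: Record) -> str:
    return "bad" if r["age"] < 25 else "good"


def tree_c(r: Record) -> str:
    return "bad" if r["duration"] > 40 and r["age"] < 30 else "good"


applicant = {"age": 22, "duration": 36}
print(forest_predict([tree_a, tree_b, tree_c], applicant))
```

Using an odd number of trees in a two-class problem avoids tied votes in this simple scheme.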

Discussion 2 - Strategy Applied in Project Management

Read and reflect on the assigned readings for the week. Then post what you thought was the most important concept(s), method(s), term(s), and/or any other thing that you felt was worthy of your understanding in each assigned textbook chapter. Your initial post should be based upon the assigned reading for the week, so the textbook should be a source listed in your reference section and cited within the body of the text. Other sources are not required, but feel free to use them if they aid in your discussion.

Also, provide a graduate-level response to each of the following questions:

  1. The culture of the organization can impact the effectiveness of different project management structures. Organizational cultures that do not encourage teamwork, collaboration, and cross-functional integration need a stronger project management structure (i.e., project team, project matrix) to be successful. Conversely, a functional matrix can be effective in an organization in which the culture of the organization is conducive to project management.
  2. You work for LL Company, which manufactures high-end optical scopes for hunting rifles. LL Company has been the market leader for the past 20 years and has decided to diversify by applying its technology to develop a top-quality binocular. What kind of project management structure would you recommend they use for this project? What information would you like to have to make this recommendation, and why?

Chapters-

Chapter 3 – Organization: Structure and Culture

Chapter 4 – Defining the Project

Text

Title: Project Management: The Managerial Process 

ISBN: 9781260238860 

Authors: Clifford F. Gray, Erik W. Larson 

Publisher: McGraw-Hill Education 

Publication Date: 2020-01-09

Discussion 2 - Project HR & Stakeholder Management

Read and reflect on the assigned readings for the week. Then post what you thought was the most important concept(s), method(s), term(s), and/or any other thing that you felt was worthy of your understanding in each assigned textbook chapter. Your initial post should be based upon the assigned reading for the week, so the textbook should be a source listed in your reference section and cited within the body of the text. Other sources are not required, but feel free to use them if they aid in your discussion.

Also, provide a graduate-level response to each of the following questions:

  1. What are the elements within a Stakeholder Management Plan? Why is it important to have a Stakeholder Management Plan? 

Chapter- 

Chapter 2 – Categorizing Stakeholders

Text

Title: Managing Project Stakeholders 

ISBN: 9781118504277 

Authors: Tres Roeder 

Publisher: John Wiley & Sons 

Publication Date: 2013-04-22

Discussion 2 - Project Risk & Quality Management

Read and reflect on the assigned readings for the week. Then post what you thought was the most important concept(s), method(s), term(s), and/or any other thing that you felt was worthy of your understanding in each assigned textbook chapter. Your initial post should be based upon the assigned reading for the week, so the textbook should be a source listed in your reference section and cited within the body of the text. Other sources are not required, but feel free to use them if they aid in your discussion.

Also, provide a graduate-level response to each of the following questions:

  1. What factors make a project high risk?
  2. What are the three types of project risk?
  3. How do you write a good project risk?

Chapter-

Chapter 3: Projects and Project Stakeholders

Text

Title: Managing Project Risks 

ISBN: 9781119489733 

Authors: Peter J. Edwards, Paulo Vaz Serra, Michael Edwards 

Publisher: John Wiley & Sons 

Publication Date: 2019-08-13

Discussion 2 - Executing the Project

Before we execute a project, we must baseline it. Search the Internet and ascertain what is meant by “baselining” a project. Specifically, which items are baselined? Stated another way, what items are in the baseline? Describe them.

Text

Title: Project Management 

ISBN: 9780134730332 

Authors: Pinto 

Publisher: Pearson 

Edition: 5TH 19

Data mining

Q1. Why are the original/raw data not readily usable by analytics tasks? What are the main data preprocessing steps? List and explain their importance in analytics.

Q2. What are the privacy issues with data mining? Do you think they are substantiated? 

Each response should be 300 words. There must be at least one APA-formatted reference for each question (with an APA in-text citation) to support the thoughts in the post. Do not use direct quotes; rather, rephrase the author’s words and continue to use in-text citations.

ITS-530 – Analyzing & Visualizing Data – Paper

Homework:

Review The Power of Good Design (https://www.vitsoe.com/gb/about/good-design) and select three of the ten principles of good design. Next, in R, apply these three principles to a problem that you will solve. First, state the problem to solve, the dataset (where the information was pulled from), and the methods you are going to use to solve the problem. Ensure the problem is simple enough to complete within a two-page document. For example: I need to purchase a house and want to know what my options are given x amount of dollars and x location, based on a sample of data from Zillow within each location.

Ensure there is data visualization in the homework and note how it relates to the three principles selected.

Parts of this assignment:

Part 1:  Review the 3 Principles that you are going to use.

Part 2:  Discussion of the Problem.

Part 3:  The dataset (where you got the data)

Part 4:  Explain how you can solve the problem

Part 5:  Make sure you have a data visual (you can create the visual with R-Language)

Note: plagiarism check required, APA7 format, include References, within 8hrs

Discussion 300 words

 Why are security policies important? 

Using P5 and JavaScript, create a simple smiley face

 

The purpose of this assignment is to get you started writing simple lines of code in JavaScript. It will be done making something artistic – not always the end use case of code, but coding can be a way to express yourself! This assignment will have you writing commands, or lines of code that do something, and variables, which are lines of code that help you create and store data. As a first assignment, this is deceptively easy. Future assignments will require more effort and time.

It is expected that you do this work without referencing material beyond that used in the class (i.e., without using Stack Overflow or other resources to find exact solutions to the problems posed).

In this assignment and each future assignment you will be asked to create an algorithm (or plan) for your code, which will be shown in your comments in the code itself. You may also wind up doing some of your thinking on paper before turning it into a plan you can create code for. 
Additionally, you will be required to write a brief reflection on your experience.

Submit a Word document that includes:

1. Your code

2. A reflection on the process you employed to get to this outcome (how did you plan your code, what resources did you use, where did you struggle, what were your takeaways / insights?)

3. The planning material you used for your program (pseudo code, algorithm development, drawings/sketches/flowcharts)

Week 7 DQ IT

Q.  Why are security policies important?

Note: 300 words with in-text citations and 2 references needed.
