Xtract Ltd Hitsaajankatu 22 00810 Helsinki FINLAND T +358 9 222 4122 F +358 9 222 4155 T-61.3050 27.11.2007 Xtract /

Presentation transcript:

Xtract / Juha Vesanto: Data mining in practice

Company Confidential

Intro

My history: Juha Vesanto
- M.Sc. in Engineering Physics, 1997
- Dr. Tech. in Information Science, 2002
- IDE research group
- Dissertation: "Data mining using the Self-Organising Map"

Xtract history
- Founded in 2001
- Main areas of operation: analytics and business consulting on data-based analytics software and integration services data
- Analytics specialities: customer analytics; segmentation, targeting; social network analytics
- Personnel: in Helsinki, London, and sales representatives elsewhere
- Forecast revenue this year: >3.5 M€
- Customers: Nokia, SanomaMagazines, Lehtipiste, Tradeka, Luottokunta, Vodafone,

BUSINESS DATA MINING
Data mining in practice

Data mining in practice – not

Business data mining
[Diagram labels: DATA, SYSTEM, NEED, MODEL]

Business modelling

Business question: "To whom should I market my product?"
Analytics question: "p(purchase | customer)"

Business modelling
- How do I get more revenue?
- How do I get more buyers for the product?
- How do I get buyers efficiently?
- What is the value of a purchase vs. its cost?
- What else has to be taken into account?
- Selection of marketing contacts?
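
The step from the business question to the analytics question on this slide can be sketched in a few lines. This is only an illustration, assuming that a score p(purchase | customer) is already available for each customer and that the value of an order and the cost of a contact are known constants. The function name `select_contacts` and all the numbers are hypothetical and do not come from the presentation.

```python
def select_contacts(purchase_prob, order_value, contact_cost):
    """Return the customers whose expected order value exceeds the contact cost.

    purchase_prob maps a customer id to an estimated p(purchase | customer).
    """
    return [
        customer
        for customer, p in purchase_prob.items()
        if p * order_value > contact_cost
    ]


# Hypothetical scores and prices, for illustration only.
scores = {"anna": 0.002, "ben": 0.05, "carla": 0.20}
selected = select_contacts(scores, order_value=30.0, contact_cost=1.0)
print(selected)
```

The comparison `p * order_value > contact_cost` is one simple way to express "value of a purchase vs. its cost"; a real campaign would probably also have to handle the "what else has to be taken into account" item, such as contact limits or opt-outs.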

Business and analytics viewpoints

Business viewpoint
- modelling answers business needs
- aims at results deployment

Analytics viewpoint
- data mining is about finding something interesting from data
- data mining starts with and revolves around data

[Diagram labels: DATA, NEED; Business understanding, Data understanding, Preparation, Modeling, Evaluation, Deployment]

DATA MINING PROCESS
Data mining in practice

CRISP-DM

CRoss-Industry Standard Process for Data Mining
- partners: Teradata, SPSS, DaimlerChrysler, OHRA + special interest group
- "51% of data miners use CRISP-DM methodology"

CRISP-DM Phases

1. Business understanding
 - business need
 - data mining target
 - project planning
2. Data understanding
 - data collection
 - data review
3. Data preparation
 - data preprocessing
 - data enrichment
 - feature extraction
4. Modeling
 - model family selection
 - model optimization
 - model testing
 - model review
5. Evaluation
 - validation w.r.t. the need
 - results review
6. Deployment
 - taking results into use
 - model monitoring
 - updating the model

PRACTICE
Business modelling

Business & data understanding

Business

Understand the customer's operations
- What is the customer's goal?
- What does the customer really need?
- What actions is the customer ready / accustomed to take?
- What other factors have to be taken into account?

Identify the stakeholders
- Who is really the payer / the one ordering the work?
- Who would really use the results?

Find out and set the target
- What is the client's target (revenue, margin, pull, market share)?
- What does the client expect as the outcome of the project?
- What has the client planned to do with the results?

Data

Understand the customer's data
- What data does the customer already have?
- Where does it come from, and when is it updated?

Modelling

How is the data turned into results?
- Model structure → reliability, repeatability, level of results

Data → Solution
- How can the data be used to solve the customer's problem?
- What does the customer do in practice with the results that the analytics provides?

Data preparation: compensate for the imperfect nature of the data

In principle
- Analytical models aim at building a faithful representation of the real world
- Rule model: if x > 3 & y < 4
- Linear model: if x + y < 7

In practice
Practical difficulties arise from
- Measurements
 - what can be measured?
 - what has been measured?
 - timing of measurements
- Data collection
 - vague concepts → misunderstanding
 - typing errors
 - differences in system settings (e.g. time zones)

[Figure labels: noise, bias, time delays, outlier, lost samples, randomness; event, measurement, effect, time]
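
One of the data collection difficulties named on this slide, differences in system settings such as time zones, can be illustrated with a short sketch. It assumes two source systems that log naive local timestamps, one at a fixed UTC+2 offset and one in UTC. The helper `to_utc` and the timestamps are hypothetical, and the fixed offset deliberately ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: system A logs local time at a fixed UTC+2 offset (no DST handling).
SYSTEM_A_TZ = timezone(timedelta(hours=2))
# Assumption: system B logs UTC.
SYSTEM_B_TZ = timezone.utc


def to_utc(naive_timestamp, source_tz):
    """Attach the source system's time zone and convert the timestamp to UTC."""
    return naive_timestamp.replace(tzinfo=source_tz).astimezone(timezone.utc)


# The same event as recorded by the two systems.
event_a = to_utc(datetime(2007, 11, 27, 14, 0), SYSTEM_A_TZ)
event_b = to_utc(datetime(2007, 11, 27, 12, 0), SYSTEM_B_TZ)
print(event_a == event_b)
```

Without the conversion the two records would appear to be two hours apart, which is the kind of artificial time delay that data preparation has to compensate for.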

Data preparation

- Read data from the data sources
- Clean the data
- Make relevant information more clearly visible
- Data enrichment
- Transform data to fit the assumptions of the modelling technique

Usually 80% of the work (and typically 50-90% of the end result)

[Figure labels: Outlier removal; Rotation → a single rule is sufficient]
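
The remark that after a rotation a single rule is sufficient can be made concrete with a small sketch. It assumes, purely for illustration, a class boundary of the form x + y < 7, which an axis-aligned rule on x or y alone cannot express. After rotating the axes by 45 degrees, one threshold on the first rotated coordinate reproduces it. The function names and sample points are hypothetical.

```python
import math


def rotate(x, y, angle):
    """Express the point (x, y) in axes rotated counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * y, -s * x + c * y)


def linear_model(x, y):
    """Diagonal boundary that needs both variables in the original coordinates."""
    return x + y < 7


def single_rule(x, y):
    """One threshold on the first rotated coordinate u = (x + y) / sqrt(2)."""
    u, _ = rotate(x, y, math.pi / 4)
    return u < 7 / math.sqrt(2)


# Sample points chosen away from the boundary to avoid rounding effects.
points = [(1, 1), (5, 3), (0, 6.5), (6, 2), (3, 3)]
agree = all(linear_model(x, y) == single_rule(x, y) for x, y in points)
print(agree)
```

In practice the rotation angle would have to be estimated from the data, for example with principal component analysis; here it is fixed by hand.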

TSF Segmentation Project. Project Manager: Priya Sawhney. Reference: Thomas Pimenoff.

Data enrichment: CLC classes

1. Tenant suburbs of younger singles and couples
2. Singles in city apartments
3. Middle class in apartments
4. Well educated, high income families
5. Countryside
6. Middle class in detached houses
7. Small income detached house areas
8. Retiree areas

Class descriptions

- Young singles or couples without children in small apartments. Well-educated, very involved in their work. Prefer the vitality of the large city to the tranquility of outer suburbs. Low income per household (due to the large share of singles).
- Lower and middle income housing, occupied by students, junior administrative and service employees. Rental apartments in larger towns. High concentration of unemployment and people with low incomes.
- Residential neighborhoods on the outskirts of towns and cities, mainly private housing. Younger singles and couples in their 30s. The educational, income and wealth figures are rising; low unemployment.
- High income families in the more affluent suburbs. Professionals and wealthy business-people living in large and expensive owner-occupied houses. Two-income, two-car households.
- (Once) less expensive areas of large detached houses in the outskirts of small and medium-sized towns. Skilled manual and white-collar workers with their families. Low rate of unemployment. Unpretentious areas, where sensible and self-reliant people have worked hard to achieve a comfortable and independent lifestyle.
- Middle-aged households living in detached houses with small income. High unemployment rate, limited assets. Industry is or has been the most important employer. Areas located near the industrial centers of Finland.
- Retired and soon-to-be-retired singles and couples, who typically own their houses or apartments. High levels of discretionary expenditure (low household income, but low expenditure on rent, mortgages and children).
- Rural areas where agriculture and industry (where industry still remains) remain a significant source of local employment. Considerable variance in the levels of affluence, from the old family farm areas to the quiet small villages of only retired farmers and workers.

Modelling

Task: Targeting
 Question: "I want to market my product. I could send my ad to 1 million people, but I only expect 2000 orders, so that's useless letters..."
 Modelling: Predictive scoring model based on an earlier campaign using available
 Case: publishers, banks, retailers,...

Task: Segmentation
 Question: "I have 1 million customers. They are a grey mass. Help?"
 Modelling: Segment the customers into actionable groups.
 Case: just about anybody, e.g. operators

Task: Pricing
 Question: "I need to set the price for my product. What is the optimal price?"
 Modelling: Price elasticity model log(dprice) ~ -a log(dvol)
 Case: just about anybody, e.g. retailers

Task: Logistics
 Question: "I have 500 retail outlets. How many products should I ship to each outlet to ensure optimal coverage?"
 Modelling: Seasonal variation models
 Case: retailers, e.g. Lehtipiste

Task: Fraud detection
 Question: "I need to identify fraudulent credit card transactions."
 Modelling: Predictive scoring models; likelihood models
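
The price elasticity relation in the pricing row, log(dprice) ~ -a log(dvol), can be fitted with ordinary least squares. The sketch below is only an illustration. It assumes that dprice and dvol are relative changes (new value divided by old value) and that the line has no intercept, neither of which the slide states. The function name and the synthetic data are hypothetical.

```python
import math


def fit_elasticity_coefficient(dprice, dvol):
    """Least-squares estimate of a in log(dprice) = -a * log(dvol), no intercept."""
    xs = [math.log(v) for v in dvol]
    ys = [math.log(p) for p in dprice]
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic, noise-free observations generated with a known coefficient.
a_true = 0.5
dvol = [0.8, 0.9, 1.1, 1.25]
dprice = [v ** (-a_true) for v in dvol]

a_hat = fit_elasticity_coefficient(dprice, dvol)
print(round(a_hat, 6))
```

With real sales data the observations would be noisy, and the fitted coefficient would only be meaningful over the range of price changes that was actually observed.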

Analytical evaluation (& validation)

There are several ways to look at the data and the results. For the best results, check the data from all of these angles.

1. Statistics
 - compare statistics of input and output data tables (starting with N = number of samples): do they match, are the deviations as intended by the preprocessing?
 - correlations
 - result statistics: check score histograms, segment sizes
 - model statistics
2. Cases / samples
 - pick 1-5 sample data cases, and go through the processing by hand: are the results as intended?
3. Common sense
 - go through the results (cross-tabulations, deductions, histograms, decile profiles): do they make sense?
4. Code review
 - what is the processing script / pipeline / program?
 - go through the code and try to find logical inconsistencies etc.
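
The statistics checks in item 1 can be partly automated. The following is a minimal sketch, assuming that scores lie in the range 0 to 1 and that the preprocessing is expected to drop a known number of rows. All function names and figures are hypothetical and only illustrate the kind of check the slide describes.

```python
from collections import Counter


def row_count_matches(n_input, n_output, n_removed_expected):
    """Check that the change in N equals what the preprocessing intended."""
    return n_input - n_output == n_removed_expected


def score_histogram(scores, n_bins=10):
    """Count scores in equal-width bins over [0, 1]; a score of 1.0 goes in the last bin."""
    counts = [0] * n_bins
    for score in scores:
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


def segment_sizes(segments):
    """Return the number of customers assigned to each segment label."""
    return Counter(segments)


# Hypothetical figures, for illustration only.
print(row_count_matches(n_input=1000, n_output=990, n_removed_expected=10))
print(score_histogram([0.05, 0.15, 0.95, 1.0]))
print(segment_sizes(["A", "B", "A", "C", "A"]))
```

Checks like these cover only the first angle; the sample-by-sample walk-through, the common sense review and the code review still have to be done by a person.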

Business evaluation

- Are the results practically usable?
- Review by end users
- Design and pilot field tests

Deployment

Level 1: Internal analytics
 Action: Data mining activity. Distribution of the results to the organization. Utilization of results.
 Benefits: Better understanding of the data for the data miner, and for the organization. Direct economic value through increased efficiency, decreased costs, or bigger revenue.

Level 2: Repeated analytics (backoffice)
 Action: Monitoring and follow-ups.
 Benefits: Better understanding of business & data. Identification of further opportunities. Continuing increases in economic value.

Level 3: Scheduled analytics (batch)
 Action: Planned, scheduled updates that tie in with business processes.
 Benefits: Further efficiency from regular usage. No risk from applying outdated models.

Level 4: Integrated analytics (online)
 Action: Continuous updates to the model and scores.
 Benefits: Recurring benefits from the continuously applied model. Minimized operational costs & risks.

Contact Details: Juha Vesanto, Xtract Ltd, Hitsaajankatu, Helsinki, FINLAND