2018/03/25

For Data Scientists

Useful links

  1. scikit-learn [link]

Data Visualization


  1. Information Visualization
Scientific Visualization : visualization of everyday, real-world content
Information Visualization : abstract data visualization

  1. Definition
Provide tools that present data in a way that helps people understand it and gain insight from it

  1. InfoVis is interdisciplinary
graphics
cognitive psychology
HCI : using users and tasks to guide design and evaluation

  1. Expressiveness and Effectiveness
Expressiveness : Vis idiom should express all of, and only, the information in the dataset attributes
Effectiveness : Most important attributes should be encoded with the most effective channels --> ranking of channels
correctness, accuracy, and truth

  1. Stevens' Power Law
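A formulation stating that the relationship between a sensory stimulus and the resulting sensory experience follows a power function. The exponent differs by type of sensation.
   $p = k\,a^{\beta}$, p : perceived magnitude, a : actual magnitude, k : scaling constant, β : exponent that depends on the type of sensation
              length judgement  β ≈ 1
              area judgement    β < 1 (commonly cited: about 0.6 to 0.9)
              volume judgement  β < 1 (commonly cited: about 0.5 to 0.8)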
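
As a rough numerical illustration of Stevens' power law (perceived magnitude p equals a constant k times the actual magnitude a raised to a sensation-specific exponent), the sketch below compares how much larger a stimulus looks when its actual magnitude doubles. The function name `perceived_magnitude` and the exponents 1.0, 0.7, and 0.6 are assumptions chosen for illustration, not values taken from these notes.

```python
def perceived_magnitude(actual: float, exponent: float, k: float = 1.0) -> float:
    """Stevens' power law: p = k * a ** exponent."""
    if actual < 0:
        raise ValueError("actual magnitude must be non-negative")
    return k * actual ** exponent


# Illustrative exponents only (assumed for this sketch).
EXPONENTS = {"length": 1.0, "area": 0.7, "volume": 0.6}

# How much larger does a stimulus look when its actual magnitude doubles?
perceived_ratio = {
    name: perceived_magnitude(2.0, beta) / perceived_magnitude(1.0, beta)
    for name, beta in EXPONENTS.items()
}

for name, ratio in perceived_ratio.items():
    print(f"{name}: actual x2.00 -> perceived x{ratio:.2f}")
```

With an exponent below 1, a doubled area or volume is perceived as less than doubled, which is one reason length tends to be a more accurate encoding than area or volume.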

  1. Relative vs. Absolute Judgements
perceptual system mostly operates with relative judgements, not absolute
Weber's Law : the just-noticeable change in a stimulus is proportional to the magnitude of the initial stimulus
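
A minimal sketch of Weber's Law, assuming the usual form ΔI = k · I, where ΔI is the just-noticeable difference (JND) and k is the Weber fraction. The value k = 0.05 and the helper names are hypothetical and chosen only to show why the same absolute change can be visible on a small stimulus and invisible on a large one.

```python
def just_noticeable_difference(intensity: float, weber_fraction: float) -> float:
    """Weber's law: delta_I = k * I, where k is the Weber fraction."""
    return weber_fraction * intensity


def is_noticeable(base: float, changed: float, weber_fraction: float) -> bool:
    """True if the change from base to changed reaches the JND for base."""
    return abs(changed - base) >= just_noticeable_difference(base, weber_fraction)


K = 0.05  # hypothetical Weber fraction, chosen only for illustration

# The same absolute change (+2) is judged relative to the starting stimulus.
small = is_noticeable(10.0, 12.0, K)    # JND for 10 is 0.5
large = is_noticeable(100.0, 102.0, K)  # JND for 100 is 5.0
print(small, large)
```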

  1. Preattentive Processing : cognitive operations done preattentively, without the need for focused attention
"pop out" of a display : easily detected regardless of the number of distractors
target detection, boundary detection, region tracking, counting

  1. Design Guidelines/Principles
Visual Information Seeking Mantra
Overview first, zoom and filter, details on demand
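
The three steps of the mantra can be sketched as three small operations on a dataset. The records and function names below are hypothetical, invented only for this sketch; a real tool would implement these steps as interactive views rather than function calls.

```python
from collections import Counter

# Hypothetical records, invented purely for this sketch.
RECORDS = [
    {"name": "car A", "nation": "USA", "mpg": 18},
    {"name": "car B", "nation": "Japan", "mpg": 31},
    {"name": "car C", "nation": "Japan", "mpg": 28},
    {"name": "car D", "nation": "Germany", "mpg": 23},
]


def overview(records):
    """Overview first: summarize the whole dataset (count per nation)."""
    return Counter(r["nation"] for r in records)


def zoom_and_filter(records, nation):
    """Zoom and filter: keep only the items of interest."""
    return [r for r in records if r["nation"] == nation]


def details_on_demand(records, name):
    """Details on demand: full record for one selected item, or None."""
    return next((r for r in records if r["name"] == name), None)


summary = overview(RECORDS)
japanese = zoom_and_filter(RECORDS, "Japan")
detail = details_on_demand(japanese, "car C")
print(summary, [r["name"] for r in japanese], detail)
```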

  1. Measuring Misrepresentation
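
One widely used way to measure misrepresentation is Tufte's lie factor: the size of the effect shown in the graphic divided by the size of the effect in the data. These notes do not say which measure the heading refers to, so reading it as the lie factor is an assumption. The function names and numbers in the sketch below are hypothetical.

```python
def relative_change(first: float, second: float) -> float:
    """Relative size of an effect: (second - first) / |first|."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return (second - first) / abs(first)


def lie_factor(data_first: float, data_second: float,
               shown_first: float, shown_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    data_effect = relative_change(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data shows no effect, lie factor is undefined")
    return relative_change(shown_first, shown_second) / data_effect


# Hypothetical chart: the data goes from 10 to 20 (+100%),
# but the drawn bar grows from 1.0 to 4.0 units (+300%).
factor = lie_factor(10.0, 20.0, 1.0, 4.0)
print(factor)
```

A lie factor of 1 means the graphic is proportional to the data; values far from 1 indicate exaggeration or understatement.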



  1. Design Principles
avoid chartjunk
use small multiples
utilize narratives of space and time

  1. Visualization Analysis and Design
    1. Definitions and Motivations
computer-based visualization systems provide visual representations of datasets designed to help people carry out (some) tasks more effectively
  1. Big picture
    1. VIS is suitable when there is a need to augment human capabilities
    2. design visual representations to help people perform tasks more effectively
    1. design space is HUGE!
    2. resource limitations
    1. analysis instance
WHY the user needs it, WHAT data is shown, HOW the idiom is designed
  1. Transitional use
gain a clear understanding of the user's task --> move to a purely computational solution --> monitor whether the automatic system is doing the right thing
  1. Long-term use
    1. Exploratory analysis
    2. Vis tool for presentation
  2. Why use interactivity?
    1. impossible to show everything at once
    2. handling complexity and volume
  3. Visualization Idioms : distinct approaches to creating and manipulating visual representations
a tool that serves one task well can be poorly suited to another

  1. Analysis : Four Levels for Validation
so many possible ways.
Four Levels of vis design
  output of upstream level --> input to downstream level
    : upstream errors inevitably cascade down

Four kinds of threats to validity
  1. wrong problem : the target users don't actually do that
  2. wrong abstraction : you're showing them the wrong thing
  3. wrong encoding/interaction technique : the way you show it doesn't work
  4. wrong algorithm : the code is too slow
--> validation can be done early through prototyping.

  1. What : Data Abstraction
[Figure: "What?" data abstraction overview. Data types: items, attributes, links, positions, grids. Dataset types: tables (including multidimensional tables), networks and trees, fields (continuous), geometry (spatial), clusters/sets/lists. Attribute types: categorical; ordered (ordinal, quantitative). Ordering direction: sequential, diverging, cyclic.]

  • semantics : real-world meaning
  • attribute, item, link, grid, position
  • set : unordered group of items
  • list : ordered group of items
  • cluster : grouping based on attribute similarity
  • path : ordered set + links connecting nodes

  • data abstraction : domain-specific to GENERIC
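
The set, list, cluster, and path definitions above map fairly directly onto ordinary data structures. The sketch below is one possible reading in Python; the items and their `kind` attribute are hypothetical, and "similarity" is simplified to attribute equality.

```python
from collections import defaultdict

# Hypothetical items (attribute values invented for illustration).
items = [
    {"id": "a", "kind": "fruit"},
    {"id": "b", "kind": "vegetable"},
    {"id": "c", "kind": "fruit"},
]

item_set = {item["id"] for item in items}   # set: unordered group of items
item_list = [item["id"] for item in items]  # list: ordered group of items

# cluster: group items whose attribute values are similar (here: equal).
clusters = defaultdict(list)
for item in items:
    clusters[item["kind"]].append(item["id"])

# path: an ordered sequence of nodes plus the links between consecutive nodes.
path_nodes = ["a", "b", "c"]
path_links = list(zip(path_nodes, path_nodes[1:]))

print(item_set, item_list, dict(clusters), path_links)
```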

  1. Why: Task Abstraction
    • task abstraction should guide data abstraction
    • Analyze > Search > Query
    • Discover, Present, Enjoy, Annotate, Record, Derive (used more as experience grows -> broadens the range of visualization idioms available)

  1. Marks and Channels
    • An idiom can be broken down into marks and channels
    • Mark
      • basic graphical element : point (0D), line (1D), area (2D), volume (3D)
      • Items / Nodes
      • Links : containment, connection
    • Channel : a way to control the appearance of marks
      • Position : horizontal, vertical, both
      • Color
      • Shape
      • Tilt
      • Size : length, area, volume
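
To make the mark/channel decomposition concrete, the sketch below describes two familiar idioms as a mark type plus a mapping from channels to data attributes. The class names and the attribute names (`weight`, `height`, `count`, `category`) are hypothetical; this is one way to write the idea down, not a standard API.

```python
from dataclasses import dataclass
from enum import Enum


class Mark(Enum):
    """Basic graphical elements; the value is the mark's dimensionality."""
    POINT = 0
    LINE = 1
    AREA = 2
    VOLUME = 3


@dataclass
class Idiom:
    name: str
    mark: Mark
    channels: dict  # channel name -> data attribute it encodes


def describe(idiom: Idiom) -> str:
    """One-line summary of how an idiom encodes its attributes."""
    encodings = ", ".join(f"{ch} <- {attr}" for ch, attr in idiom.channels.items())
    return f"{idiom.name}: {idiom.mark.name.lower()} marks; {encodings}"


scatterplot = Idiom(
    "scatterplot",
    Mark.POINT,
    {"horizontal position": "weight", "vertical position": "height"},
)
bar_chart = Idiom(
    "bar chart",
    Mark.LINE,
    {"length": "count", "horizontal position": "category"},
)

print(describe(scatterplot))
print(describe(bar_chart))
```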
  • Expressiveness types and effectiveness rankings (memorize the 1st and 2nd ranks)

Magnitude Channels : Ordered Attributes (most effective first)
  1. Position on common scale
  2. Position on unaligned scale
  3. Length (1D size)
  4. Tilt/angle
Identity Channels : Categorical Attributes (most effective first)
  1. Spatial region
  2. Color hue
  3. Motion
  4. Shape
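
The effectiveness principle (most important attributes get the most effective channels) can be sketched as a simple greedy assignment over the rankings listed above. The function `assign_channels` and the example attributes are hypothetical, and the sketch ignores interactions between channels, such as position on a common scale and spatial region both consuming screen space.

```python
# Rankings as listed in these notes, most effective first.
MAGNITUDE_CHANNELS = [
    "position on common scale",
    "position on unaligned scale",
    "length (1D size)",
    "tilt/angle",
]
IDENTITY_CHANNELS = ["spatial region", "color hue", "motion", "shape"]


def assign_channels(attributes: list) -> dict:
    """Greedy assignment of channels to attributes.

    attributes: (name, kind) pairs in decreasing order of importance,
    where kind is "ordered" or "categorical".
    """
    remaining = {
        "ordered": list(MAGNITUDE_CHANNELS),
        "categorical": list(IDENTITY_CHANNELS),
    }
    assignment = {}
    for name, kind in attributes:
        if kind not in remaining:
            raise ValueError(f"unknown attribute kind: {kind!r}")
        if not remaining[kind]:
            raise ValueError(f"no channel left for attribute {name!r}")
        # Take the most effective channel that is still unused.
        assignment[name] = remaining[kind].pop(0)
    return assignment


result = assign_channels(
    [("price", "ordered"), ("nation", "categorical"), ("weight", "ordered")]
)
print(result)
```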

 * Better encoding ?
[Figure: bar chart "Car nationality for 1979". One axis lists cars (Accord, AMC Pacer, Audi 5000, BMW 320i, Champ, Chev Nova, Civic, Datsun 210, Datsun 810, Deville, Le Car, Linc Cont, Horizon, Mustang, Peugeot, Saab 900, Subaru, Volvo 260, VW Dasher); the other axis is Nation (USA, Japan, Germany, France, Sweden), encoded as bar length.]
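Because of the bar size, the chart implicitly expresses a priority or ranking among the nations, which conveys information that is not in the data.
To remove this, the length encoding is replaced with position, which has a higher effectiveness ranking in line with Stevens' power law.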
[Figure: the same "Car nationality for 1979" data (Car vs. Nation), with Nation encoded by position instead of bar length.]

  • position dominates the user's mental model
  1. Preattentive processing
    1. cognitive operation done preattentively, without the need for focused attention
    1. popout, segmentation effects
      1. many channels support popout : tilt, size, shape, proximity, shadow direction; parallel lines do not

  1. Gestalt Psychology
A school of psychology emphasizing that humans have a basic tendency to organize what they see, and that the whole is more than the sum of its parts.
  1. proximity : same spatial region
  2. similarity : same value as other categorical channels (color hue, motion, shape)
  3. connectedness : the strongest cue, overpowering all other channels
  4. continuity : elements are interpreted along the physically simplest, most easily understood path
  5. common fate : things moving together
  1. luminance contrast : Perception of color and luminance is contextual, based on relative judgements

  1. Perception and Visual Patterns
    1. Gestalt Principles
      1. Grouping : avoid explicit grouping
        1. Proximity
        2. Similarity
        3. Continuity
        4. Common Fate

  1. Perception of Forms
    1. Closure : form complete, closed figures to increase regularity
    2. Area/Figure and Ground/Relative size : smaller one as figure, larger one as ground
    3. Symmetry : symmetric images are perceived collectively, even across a distance

  1. Fixation-Saccade Cycle
    1. Fixation : brief stationary period when detail information is acquired
    2. saccade : flicking rapidly to a new location
  2. postattentive amnesia

  1. Rules of Thumb
    1. Overview
      1. no unjustified 2D / 3D
      2. eyes beat memory
      3. resolution over immersion
      4. function first, form next

  1. Interaction Design and Design Principles
    1. the fundamental design goal is to provide the right affordances and good mapping
      1. right affordance : perceived and actual properties
      2. right conceptual model
      3. good mapping : a relationship between controls and their movements or effects
      4. causality : interpretation of feedback

Helpful resources for understanding D3


1) How selections work [link]