How to get just one frequency of occurrence for items in different columns?


Ram Sundar

I have a csv file that looks like the following:

ID  L1  L2  L3  L4  X1  Y1  Z1
1   3   3   1   2   f   f   x
1   3   3   3   2   g   f   f
2   3   4   4   3   o   p   q

I want to use (ID, Li), where i = 1, 2, 3, 4, as the key and the frequency of occurrence as the value. I want the output as lists such as [1, 3, 5], which represent the following:

<1, 3> appeared 5 times (i.e. for ID 1, the value 3 appeared 5 times across L1, L2, L3 and L4)
<1, 1> appeared 1 time
<1, 2> appeared 2 times
<2, 3> appeared 2 times
<2, 4> appeared 2 times

If a new (ID, value) pair turns up, it gets added as a new entry; if the pair already exists, its count is incremented.

Here is what I have tried:

import csv
import sys
from collections import defaultdict

csv.field_size_limit(sys.maxsize)

d = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
with open(myfile, 'r') as fi:
    for item in csv.DictReader(fi):
        for count in range(1, 5):
            d[int(item['ID'])]['L' + str(count)][item['L' + str(count)]] += 1

But this creates separate counts for each of the columns L1-L4, e.g. [1 (ID), 3 (L1), 2 (frequency)] and [1, 3 (L2), 2]. How can L1-L4 be treated as one, so that the frequency is counted per ID and value across all four columns?

Leo K

You have done well to state your problem with the phrase "keeping (id, Li) as key". In fact, you can use a little-known Python ability to do just that: a tuple is hashable as long as its elements are, so it is a valid dict key. Therefore, this will work:

import csv
from collections import defaultdict

counts = defaultdict(int)

# Accumulate counts keyed by the tuple (ID, Li).
# int() assumes you want the values as integers; drop it to keep the
# strings that csv returns.
with open(myfile, 'r') as fi:  # myfile: path to the csv file, as in the question
    for item in csv.DictReader(fi):
        row_id = int(item["ID"])  # named row_id to avoid shadowing the built-in id()
        for col in ("L1", "L2", "L3", "L4"):
            counts[(row_id, int(item[col]))] += 1

# Now all that's left is to convert each entry in counts to the list we want.
for key, count in counts.items():
    lst = list(key) + [count]  # list() turns the tuple (ID, Li) into [ID, Li] so the count can be appended
    print(lst)
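
If you want to try the tuple-key idea without a file on disk, here is a minimal self-contained sketch. It swaps defaultdict(int) for collections.Counter, which counts hashable keys in the same way. Note the assumptions: the sample data is the question's table rewritten as comma-separated text (the real file's delimiter may differ), and the names SAMPLE and count_pairs are made up for this illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: the question's table, assumed to be comma-separated.
SAMPLE = """ID,L1,L2,L3,L4,X1,Y1,Z1
1,3,3,1,2,f,f,x
1,3,3,3,2,g,f,f
2,3,4,4,3,o,p,q
"""


def count_pairs(fileobj, columns=("L1", "L2", "L3", "L4")):
    """Count how often each (ID, value) pair occurs across the given columns."""
    counts = Counter()
    for row in csv.DictReader(fileobj):
        row_id = int(row["ID"])
        # Every listed column contributes one (ID, value) key for this row.
        counts.update((row_id, int(row[col])) for col in columns)
    return counts


counts = count_pairs(io.StringIO(SAMPLE))

# Flatten each ((ID, value), n) entry into [ID, value, n]; sorted for stable output.
result = [[row_id, value, n] for (row_id, value), n in sorted(counts.items())]
print(result)
# [[1, 1, 1], [1, 2, 2], [1, 3, 5], [2, 3, 2], [2, 4, 2]]
```

For a real file you would pass the open file object instead of io.StringIO(SAMPLE). If the file is not comma-separated, csv.DictReader accepts a delimiter argument.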


Edited at 2021-08-12
