Trainer-curated Q&A from EmergenTeck experts. Updated for 2026 hiring season.
Ans: Any Python file is a module, its name being the file's base name without the .py extension, so my_module.py is loaded with import my_module. A package is a collection of Python modules: while a module is a single Python file, a package is a directory of Python modules containing an additional __init__.py file, which distinguishes a package from a directory that just happens to contain a bunch of Python scripts. Packages can be nested to any depth, provided that the corresponding directories contain their own __init__.py file. Packages are modules too. They are just packaged up differently: they are formed by the combination of a directory plus an __init__.py file, and they are modules that can contain other modules, for example from my_package.timing.danger.internets import function_of_love. 329. What is GIL? Ans: Python has a construct called the Global Interpreter Lock (GIL). The GIL makes sure that only one of your 'threads' can execute at any
Ans: For static analysis, PyChecker is a tool that detects bugs in source code and warns the programmer about style and complexity. Pylint is another tool that checks whether a module meets the coding standard. 181. How Does Python Handle Memory Management? Ans: Python uses a private heap to maintain its memory. The heap holds all the Python objects and data structures. This area is only accessible to the Python interpreter; programmers cannot access it directly. The Python memory manager handles the private heap and does the required allocation of memory for Python objects. Python employs a built-in garbage collector, which reclaims unused memory and returns it to the heap space. 182. What Are The Principal Differences Between The Lambda And Def? Ans: Lambda vs. def. A def body can hold multiple statements, while a lambda is a single-expression function
Ans: Memory is managed by the private heap space. All objects and data structures are located in a private heap, and the programmer has no access to it. Only the interpreter has access. Python memory manager allocates heap space for objects. The programmer is given access to some tools for coding by the core API. The inbuilt garbage collector recycles the unused memory and frees up the memory to make it available for the heap space.
Ans: Similar to Perl and PHP, Python is processed by the interpreter at runtime. Python supports an object-oriented style of programming, which encapsulates code within objects. It is derived from other languages, such as ABC, C, C++, Modula-3, Smalltalk, Algol-68, Unix shell, and other scripting languages. Python is copyrighted, and its source code is available under the Python Software Foundation License, an open-source license that is compatible with the GNU General Public License (GPL). It supports the development of many applications, from text processing to games, and works for scripting, embedded code and compiled code.
Ans: type() is the built-in function that reports the type of an object at runtime. When a single argument is passed to it, it returns the type of the given object. When three arguments (name, bases, dict) are passed, it creates and returns a new type object, that is, a new class.
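A minimal sketch of both call forms of type(). The class name Point and its attributes are hypothetical, chosen only for illustration.

```python
# One-argument form: ask for the type of an existing object.
kind_of_int = type(42)
kind_of_text = type("hello")

# Three-argument form: type(name, bases, namespace) builds a new class.
# "Point", "x", "y" and "norm1" are hypothetical names for illustration.
Point = type(
    "Point",
    (object,),
    {"x": 0, "y": 0, "norm1": lambda self: abs(self.x) + abs(self.y)},
)

p = Point()
p.x, p.y = 3, -4
print(kind_of_int, kind_of_text, type(p).__name__, p.norm1())
```

The three-argument call is roughly what a class statement does behind the scenes, which is why it is mostly seen in metaprogramming rather than everyday code.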
Ans: Modules can be imported using the import keyword. You can import modules in three ways. Example:
    import array            # importing using the original module name
    import array as arr     # importing using an alias name
    from array import *     # imports everything present in the array module
163. Explain Inheritance in Python with an example. Ans: Inheritance allows one class to gain all the members (attributes and methods) of another class. Inheritance provides code reusability and makes it easier to create and maintain an application. The class from which we are inheriting is called the super-class and the class that inherits is called the derived / child class. There are different types of inheritance supported by Python: Single inheritance, where a derived class acquires the members of a single super class. Multi-level inheritance, where a derived class d1 is inherited from base class base1, and d2 are
Ans: To install Python on Windows, follow the steps below: Download Python from this link: https://www.python.org/downloads/ and install it on your PC. Look for the location where Python has been installed on your PC using the following command on your command prompt: cmd python. Then go to advanced system settings, add a new variable, name it PYTHON_HOME and paste the copied path. Look for the path variable, select its value and select 'edit'. Add a semicolon at the end of the value if it's not present and then type %PYTHON_HOME% 123. Is indentation required in python? Ans: Indentation is necessary in Python. It specifies a block of code. All code within loops, classes, functions, etc. is specified within an indented block. It is usually done using four space characters. If your code is not indented properly, it will not execute correctly and will throw errors a
LIST vs TUPLES
LIST | TUPLES
Lists are mutable, i.e. they can be edited. | Tuples are immutable (tuples are lists which can't be edited).
Lists are slower than tuples. | Tuples are faster than lists.
Syntax: list_1 = [10, 'Chelsea', 20] | Syntax: tup_1 = (10, 'Chelsea', 20)
114. What are the key features of Python? Ans: Python is an interpreted language. That means that, unlike languages like C and its variants, Python does not need to be compiled before it is run. Other interpreted languages include PHP and Ruby. Python is dynamically typed, which means that you don't need to state the types of variables when you declare them or anything like that. You can do things like x=111 and then x="I'm a string" without error. Python is well suited to object-oriented programming in that it allows the definition of classes along with composition and inheritance. Python does not have access specifiers (like
Ans. __init__() is a first
Ans: A Python program runs directly from the source code. Each time a Python program is executed, the source code is required. Python converts the source code written by the programmer into an intermediate form (bytecode), which the Python virtual machine then executes. So Python is an interpreted language.
Ans. As Python is a scripting language, form processing can be done in Python. We need to import the cgi module to access form fields using the FieldStorage class. (Note that the cgi module was deprecated in Python 3.11 and removed in Python 3.13.) Every instance of class FieldStorage (here 'form') has the following attributes: form.name: The name of the field, if specified. form.filename: For a file upload, the client-side filename. form.value: The value of the field as a string. form.file: A file object from which data can be read. form.type: The content type, if applicable. form.type_options: The options of the 'content-type' line of the HTTP request, returned as a dictionary. form.disposition: The field 'content-disposition'; None if unspecified. form.disposition_options: The options for 'content-disposition'. form.headers: All of the HTTP headers returned as a dictionary. A code snippet of form handling in Python: import cgi form = cgi.FieldStorage() if not (form.has_key("name") and fo
Ans: Disadvantages of Python are: Python isn't the best choice for memory-intensive tasks. Python is an interpreted language and is slow compared to C/C++ or Java.
Ans: There are two ways in which a multidimensional list can be created. The first is to initialise the list directly, as with myList below:
>>> myList = [[227, 122, 223], [222, 321, 192], [21, 122, 444]]
>>> print(myList[0])
[227, 122, 223]
>>> print(myList[1][2])
192
The second approach is to create a list of the desired length first and then fill in each element with a newly created list, as demonstrated below:
>>> grid = [0] * 3
>>> for i in range(3):
...     grid[i] = [0] * 2
...
>>> for i in range(3):
...     for j in range(2):
...         grid[i][j] = i + j
...
>>> print(grid)
[[0, 1], [1, 2], [2, 3]]
Ans: c++.
Ans: The smtplib module defines an SMTP client session object that can be used to send mail to any Internet machine that runs an SMTP server. A sample email is demonstrated below.
import smtplib
SERVER = 'smtp.server.domain'
FROM = 'sender@mail.com'
TO = ['user@mail.com']  # must be a list
SUBJECT = 'Hello!'
TEXT = "This message was sent with Python's smtplib."
# Main message: headers, a blank line, then the body
message = 'From: %s\nTo: %s\nSubject: %s\n\n%s' % (FROM, ', '.join(TO), SUBJECT, TEXT)
server = smtplib.SMTP(SERVER)
server.sendmail(FROM, TO, message)
server.quit()
Ans: There are two ways in which objects can be copied in Python: shallow copy and deep copy. A shallow copy duplicates as little as possible: it creates a new container but shares the nested objects with the original. A deep copy duplicates everything, recursively. If a is the object to be copied, then copy.copy(a) returns a shallow copy of a and copy.deepcopy(a) returns a deep copy of a.
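The difference only shows up once a nested object is mutated. A small sketch, with variable names chosen for illustration:

```python
import copy

original = [[1, 2], [3, 4]]   # a list that contains mutable inner lists
shallow = copy.copy(original)
deep = copy.deepcopy(original)

original[0].append(99)        # mutate a nested object in place

print(shallow[0])  # the shallow copy shares the inner list with the original
print(deep[0])     # the deep copy has its own independent inner list
```

For flat containers of immutable values (numbers, strings) the two kinds of copy behave the same, so the cheaper shallow copy is usually enough.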
Ans: List comprehensions were introduced in Python 2.0. A list comprehension creates a new list based on an existing iterable: it maps one list into another by applying an expression to each element, optionally filtering elements with a condition. List comprehensions create lists without using map(), filter() or lambda forms.
Ans: Python can convert any value to a string by using one of two functions, repr() or str(). The str() function returns representations of values that are human-readable, while repr() generates representations that can be read by the interpreter: where possible, a string that could be passed to eval() to recreate the value. The following code snippet shows repr() and str() at work:
def fun():
    y = 2333.3
    x = str(y)
    z = repr(y)
    print("y:", y)
    print("str(y):", x)
    print("repr(y):", z)
fun()
Output:
y: 2333.3
str(y): 2333.3
repr(y): 2333.3
In Python 3 the two agree for floats (Python releases before 2.7 printed repr(y) as 2333.3000000000002). They differ for strings: str('a') is a, while repr('a') is 'a', including the quotes.
Ans: In Python, every name introduced has a place where it lives and can be looked up. This is known as a namespace. It is like a box where a variable name is mapped to the object it refers to. Whenever the variable is looked up, this box is searched to get the corresponding object.
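Ans: A Python decorator is a callable that takes a function and returns a modified or wrapped version of it. The @decorator syntax placed above a function definition applies it, which makes it easy to alter the behaviour of functions without changing their code.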
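As a sketch of the idea, here is a hypothetical decorator named count_calls that wraps a function and counts how often it is invoked. The names are illustrative, not part of any library.

```python
import functools

def count_calls(func):
    """Hypothetical decorator: count how many times func is called."""
    @functools.wraps(func)          # keep func's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls                        # same as: square = count_calls(square)
def square(x):
    return x * x

results = [square(n) for n in range(4)]
print(results, square.calls)
```

The @ line is only syntactic sugar for reassigning the function name to the decorator's return value, as the comment in the code notes.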
Ans: s = a + '[' + b + ':' + c + ']' looks like string concatenation. Not much more can be said without knowing the types of the variables a, b and c. If any of a, b, c is not a string, a TypeError is raised, because of the string constants ('[', ':' and ']') used in the statement.
Ans:
def func(n=[]):
    # do something with n
    pass
func([1, 2, 3])  # explicitly passing in a list
func()           # using a default empty list
print(n)
This would result in a NameError. The variable n is local to the function func and can't be accessed outside it, so printing it is not possible.
Ans:
with open("filename.txt", "r") as f1:
    print(len(f1.readline().rstrip()))
rstrip() is a built-in string method which strips whitespace characters (spaces, tabs, newlines) from the right end of the string.
Ans: words = ['one', 'one', 'two', 'three', 'three', 'two'] A bad solution would be to iterate over the list, check for copies somehow and then remove them. A much better solution is to use the set type. In a Python set, duplicates are not allowed, so list(set(words)) removes the duplicates. Note that a set does not preserve the original order of the elements.
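When the original order matters, one common alternative (not part of the answer above, shown here only as a sketch) is dict.fromkeys, which relies on dictionaries keeping insertion order in Python 3.7 and later.

```python
words = ['one', 'one', 'two', 'three', 'three', 'two']

# set() drops duplicates but gives no guarantee about element order.
unique_unordered = list(set(words))

# dict keys are unique and (since Python 3.7) keep insertion order,
# so this keeps the first occurrence of each word in its original position.
unique_ordered = list(dict.fromkeys(words))

print(sorted(unique_unordered))
print(unique_ordered)
```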
Ans: A docstring is the documentation string for a function, class or module. It can be accessed as function_name.__doc__
Ans: Add u before the string: u'mystring'. In Python 3 every str is already Unicode, so the prefix is optional.
Ans: Jython.
Ans: Objects referenced from the global namespaces of Python modules are not always deallocated when Python exits. This may happen if there are circular references. There are also certain bits of memory that are allocated by the C library that are impossible to free (e.g. a tool like Purify will complain about these). Python is, however, aggressive about cleaning up memory on exit and does try to destroy every single object. If you want to force Python to delete certain things on deallocation, you can use the atexit module to register one or more exit functions to handle those deletions.
Ans: Yes.
Ans: L=[0,10,20,30,40,50,60,70,80,90] L[::2]
Ans: The syntax of map is: map(aFunction, aSequence). The first argument is a function to be executed for all the elements of the iterable given as the second argument. If the function takes more than one argument, then that many iterables are given, one per argument.
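A short sketch of both cases. In Python 3, map returns a lazy iterator, so list() is used here only to make the results visible.

```python
# One iterable: apply str.upper to every element.
shouted = list(map(str.upper, ['a', 'b', 'c']))

# Two iterables: pow takes two arguments, so map pulls one from each.
powers = list(map(pow, [2, 3, 4], [3, 2, 1]))

print(shouted, powers)
```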
Ans: A higher-order function accepts one or more functions as input, returns a new function, or both. It is useful whenever a function needs to be treated as data. No import is needed to write one, but the functools module provides ready-made helpers, and functools.partial() is often used to build new functions from existing ones.
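A minimal sketch of both directions: a function that takes a function, and functools.partial producing a new one. apply_twice and power are hypothetical names used for illustration.

```python
import functools

def apply_twice(func, value):
    """Higher-order: takes a function as an argument."""
    return func(func(value))

def power(base, exponent):
    return base ** exponent

# functools.partial returns a new function with exponent fixed to 2.
square = functools.partial(power, exponent=2)

print(apply_twice(square, 3))   # square(square(3))
```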
Ans: Gather the arguments using the * and ** specifiers in the function's parameter list. This gives us the positional arguments as a tuple and the keyword arguments as a dictionary. We can then pass these arguments on when calling another function by using * and ** again:
def fun1(a, *tup, **keywordArg):
    ...
    keywordArg['width'] = '23.3c'
    ...
    fun2(a, *tup, **keywordArg)
Ans:
my_list = [(x, y, z) for x in range(1, 30) for y in range(x, 30) for z in range(y, 30) if x**2 + y**2 == z**2]
It creates a list of tuples called my_list, where the first two elements are the perpendicular sides of a right-angled triangle and the third value, z, is the hypotenuse:
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (7, 24, 25), (8, 15, 17), (9, 12, 15), (10, 24, 26), (12, 16, 20), (15, 20, 25), (20, 21, 29)]
Ans: In early Python versions, the sort function implemented a modified version of quicksort. However, it was deemed unstable, and as of Python 2.3 it was replaced by an adaptive, stable merge sort algorithm (Timsort).
Ans: You are looking for the enumerate function. It takes each element in a sequence (like a list) and pairs it with its position. For example:
>>> my_list = ['a', 'b', 'c']
>>> list(enumerate(my_list))
[(0, 'a'), (1, 'b'), (2, 'c')]
Note that enumerate() returns an object to be iterated over, so wrapping it in list() just helps us see what enumerate() produces. An example that directly answers the question is given below:
my_list = ['a', 'b', 'c']
for i, char in enumerate(my_list):
    print(i, char)
The output is:
0 a
1 b
2 c
Ans: Use the index() method:
>>> ["foo", "bar", "baz"].index('bar')
1
Ans: Python's GIL is intended to serialize access to interpreter internals from different threads. On multicore systems, it means that multiple threads can't effectively make use of multiple cores. (If the GIL didn't lead to this problem, most people wouldn't care about the GIL; it's only being raised as an issue because of the increasing prevalence of multicore systems.) Note that Python's GIL is only really an issue for CPython, the reference implementation. Jython and IronPython don't have a GIL. As a Python developer, you don't generally come across the GIL unless you're writing a C extension. C extension writers need to release the GIL when their extensions do blocking I/O, so that other threads in the Python process get a chance to run.
Ans: There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:
>>> import re
>>> [m.start() for m in re.finditer('test', 'test test test test')]
[0, 5, 10, 15]
These are the starting indices of each occurrence of the string.
Ans: The easiest way is to use the in operator.
>>> 'abc' in 'abcdefg'
True
Ans: Similar to the lower() question; use the upper() function instead.
Ans: Use the lower() function. Example:
s = 'MYSTRING'
print(s.lower())
Ans: The easiest way is to use the += operator. If the string is a list of characters, the join() function can also be used.
Ans: Following is one possible solution; there can be other similar ones:
import os
for dirname, dirnames, filenames in os.walk('.'):
    # print path to all subdirectories first.
    for subdirname in dirnames:
        print(os.path.join(dirname, subdirname))
    # print path to all filenames.
    for filename in filenames:
        print(os.path.join(dirname, filename))
    # Advanced usage:
    # editing the 'dirnames' list will stop os.walk() from recursing into there.
    if '.git' in dirnames:
        # don't go into any .git directories.
        dirnames.remove('.git')
Ans: Yes
Ans: We can create a config file and store all the global variables to be shared across modules in it. By simply importing config, all the global variables defined in it become available for use in other modules. For example, suppose a, b and c are to be shared between modules.
config.py:
a = 0
b = 0
c = 0
module1.py:
import config
config.a = 1
config.b = 2
config.c = 3
print("a, b & c resp. are:", config.a, config.b, config.c)
The output of module1.py will be: a, b & c resp. are: 1 2 3
Ans:
try:
    with open('filename', 'r') as f:
        print(f.read())
except IOError:
    print("No such file exists")
Ans: It is used to import a module in a directory, which is called package import.
Ans: Memory management in Python involves a private heap containing all Python objects and data structures. The interpreter takes care of this heap and the programmer has no direct access to it. The allocation of heap space for Python objects is done by the Python memory manager. The core API of Python provides some tools that help the programmer write reliable and more robust programs. Python also has a built-in garbage collector which recycles all the unused memory. The gc module defines functions to enable and disable the garbage collector: gc.enable() enables automatic garbage collection; gc.disable() disables automatic garbage collection.
Ans: self is a variable that refers to the instance on which a method is called. In most object-oriented programming languages, it is passed to methods as a hidden parameter. In Python, it is declared explicitly as the first parameter of each method and is passed automatically when the method is called on an instance. Each object has its own separate instance variables, which are referred to as self.xxx.
Ans: First, lists are mutable while tuples are not; second, tuples can be hashed, e.g. to be used as keys for dictionaries. As an example of their usage, tuples are used when the order of the elements in the sequence matters, e.g. geographic coordinates, a "list" of points in a path or route, or a set of actions that should be executed in a specific order. Don't forget that you can use them as dictionary keys. For everything else, use lists.
Ans: pass
Ans: The built-in dir() function, called on an instance, lists in alphabetical order the instance variables as well as the methods and class attributes defined by the instance's class and all its base classes. So by passing any object as the argument to dir() we can find all the methods and attributes of the object's class.
Ans: In Python 2, range returns a list while xrange returns an xrange object, which takes the same amount of memory no matter the size of the range. In the first case all items are generated up front (this can take a lot of time and memory). In Python 3, however, range behaves like Python 2's xrange, and you have to explicitly call the list function if you want to convert it to a list.
Ans: Python arrays and list items can be accessed with positive or negative numbers. A negative index accesses the elements from the end of the list, counting backwards. Example:
a = [1, 2, 3]
print(a[-3])
print(a[-2])
Outputs:
1
2
Ans: Iterating over a generator expression or a list comprehension does the same thing. However, the list comprehension creates the entire list in memory first, while the generator expression creates the items on the fly, so you are able to use it for very large (and also infinite!) sequences.
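A rough sketch of the memory difference and of the infinite case. The exact sizes reported by sys.getsizeof vary by interpreter, so only the comparison is shown.

```python
import itertools
import sys

squares_list = [n * n for n in range(1000)]   # all 1000 items built up front
squares_gen = (n * n for n in range(1000))    # items produced on demand

# The generator object stays small however long the sequence is.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

# A generator expression can even draw from an infinite source.
evens = (n for n in itertools.count() if n % 2 == 0)
first_five_evens = list(itertools.islice(evens, 5))
print(first_five_evens)
```

Note that a generator can be consumed only once; a second pass over squares_gen would yield nothing.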
Ans: Emacs. Any alternate answer leads to instant disqualification of the applicant
Ans: Use a list instead of a generator when: 1. You need to access the data multiple times (i.e. cache the results instead of recomputing them). 2. You need random access (or any access other than forward sequential order). 3. You need to join strings (which requires two passes over the data). 4. You are using PyPy, which sometimes can't optimize generator code as much as it can with normal function calls and list manipulations.
Ans: One of the reasons to use a generator is to make the solution clearer for some kinds of problems. The other is to treat results one at a time, avoiding building huge lists of results that you would process separately anyway.
Ans: Generators are functions that return an iterable collection of items, one at a time, in a set manner. Generators, in general, are used to create iterators with a different approach: they use the yield keyword rather than return, and calling them returns a generator object. Let's try and build a generator for Fibonacci numbers:
## generate fibonacci numbers up to n
def fib(n):
    p, q = 0, 1
    while p < n:
        yield p
        p, q = q, p + q
x = fib(10)     # create generator object
## iterating using __next__()
x.__next__()    # output => 0
x.__next__()    # output => 1
x.__next__()    # output => 1
x.__next__()    # output => 2
x.__next__()    # output => 3
x.__next__()    # output => 5
x.__next__()    # output => 8
x.__next__()    # error (StopIteration)
## iterating using loop
for i in fib(10):
    print(i)    # output => 0 1 1 2 3 5 8
Ans:
Mutable Types | Immutable Types
Dictionary | number
List | boolean
 | string
 | tuple
Ans: Sets and dictionaries support it. Tuples, however, have no comprehension syntax: the parenthesised form creates a generator expression instead.
Set comprehension:
r = {x for x in range(2, 101) if not any(x % y == 0 for y in range(2, x))}
Dictionary comprehension:
{i: j for i, j in {1: 'a', 2: 'b'}.items()}
Here {1: 'a', 2: 'b'}.items() returns the key-value pairs as 2-tuples; i is the first element of each tuple and j is the second.
Ans: [x**2 for x in range(10) if x % 2 == 0] Ans. Creates the following list: [0, 4, 16, 36, 64]
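Ans:
list = ['a', 'b', 'c', 'd', 'e']
print(list[10:])
Ans. Output: []
The above code will output [], and will not result in an IndexError. As one would expect, attempting to access a single member of a list using an index that exceeds the number of members results in an IndexError. Slicing is more forgiving: a slice that starts beyond the end of the list simply returns an empty list.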
Ans:
list = ['a', 'b', 'c', 'd', 'e']
print(list[10])
Ans. Output: IndexError.
Ans: By specifying #!/usr/bin/python you specify exactly which interpreter will be used to run the script on a particular system. This is the hard-coded path to the Python interpreter for that particular system. The advantage of this line is that you can use a specific Python version to run your code.
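Ans:
#!/usr/bin/python
def foo(x=[]):
    x.append(1)
    return x
print(foo())
print(foo())
Output:
[1]
[1, 1]
The default list is created once, when the function is defined, so both calls append to the same object.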
Ans:
#!/usr/bin/python
def foo(x, y):
    global a
    a = 42
    x, y = y, x
    b = 33
    b = 17
    c = 100
    print(a, b, x, y)
a, b, x, y = 1, 15, 3, 4
foo(17, 4)
print(a, b, x, y)
Ans. Output:
42 17 4 17
42 15 3 4
Ans:
#!/usr/bin/python
def fun2():
    global b
    print('b:', b)
    b = 33
    print('global b:', b)
b = 100
fun2()
print('b outside fun2:', b)
Ans. Output:
b: 100
global b: 33
b outside fun2: 33
Ans:
#!/usr/bin/python
def fun1(a):
    print('a:', a)
    a = 33
    print('local a:', a)
a = 100
fun1(a)
print('a outside fun1:', a)
Ans. Output:
a: 100
local a: 33
a outside fun1: 100
Ans: If a variable is defined outside a function, it is implicitly global. If a variable is assigned a new value inside a function, it is local to that function. If we want such a variable to be global, we need to declare it explicitly with the global keyword. Variables that are only referenced inside a function are implicitly global.
Ans: A lambda expression is used to create a new function object and return it at runtime. Example: my_func = lambda x: x**2 creates a function called my_func that returns the square of the argument passed.
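Ans: Python doesn't have a switch-case statement. You can use if-elif-else statements or a dictionary lookup for this purpose. Since Python 3.10 there is also a match statement for structural pattern matching.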
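A small sketch of the two classic substitutes. The function names and the status codes are hypothetical, used only to illustrate the pattern.

```python
def describe(status_code):
    """Hypothetical helper: map an HTTP-like status code to a label."""
    # if/elif/else chain, the classic stand-in for switch-case
    if status_code == 200:
        return "ok"
    elif status_code == 404:
        return "not found"
    else:
        return "unknown"

# Dictionary dispatch: look the case up, fall back to a default.
LABELS = {200: "ok", 404: "not found"}

def describe_with_dict(status_code):
    return LABELS.get(status_code, "unknown")

print(describe(404), describe_with_dict(500))
```

The dictionary form scales better when there are many cases, and the values can be functions when each case needs to run code rather than return a constant.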
Ans: some_variable = u'This is a test string' Or some_variable = u"This is a test string"
Ans: Strong. In a weakly typed language a compiler / interpreter will sometimes change the type of a variable. For example, in some languages (like JavaScript) you can add strings to numbers: 'x' + 3 becomes 'x3'. This can be a problem because if you have made a mistake in your program, instead of raising an exception execution will continue but your variables now have wrong and unexpected values. In a strongly typed language (like Python) you can't perform operations inappropriate to the type of the object; attempting to add numbers to strings will fail. Problems like these are easier to diagnose because the exception is raised at the point where the error occurs rather than at some other, potentially far removed, place.
Ans: Dynamic. In a statically typed language, the type of a variable must be known (and usually declared) at the point at which it is used. Attempting to use it otherwise will be an error. In a dynamically typed language, objects still have a type, but it is determined at runtime. You are free to bind names (variables) to different objects with a different type. So long as you only perform operations valid for the type, the interpreter doesn't care what type they actually are.
Ans:
def my_func(x):
    return x**2
Ans: Indentation.
Ans: An interpreted language is a programming language for which most of its implementations execute instructions directly, without previously compiling a program into machine-language instructions. In the context of Python, it means that a Python program runs directly from the source code.
Ans: Beginner's Answer: Python is an interpreted, interactive, object-oriented programming language. Expert Answer: Python is an interpreted language, as opposed to a compiled one, though the distinction can be blurry because of the presence of the bytecode compiler. This means that source files can be run directly without explicitly creating an executable which is then run.
Ans: Basically, Flask is a minimalistic framework which behaves the same as an MVC framework, so MVC is a good fit for Flask. Consider the MVC pattern in the following example:
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello():
    return "Hello World"
app.run(debug=True)
In this code, the configuration part is:
from flask import Flask
app = Flask(__name__)
The view part is:
@app.route("/")
def hello():
    return "Hello World"
while the model or main part is:
app.run(debug=True)
Ans: A session basically allows you to remember information from one request to another. Flask implements sessions on top of a signed cookie, so the user can look at the session contents but cannot modify them unless they know the secret key, Flask.secret_key.
Ans: The common way for the flask script to work is to give it either the import path of your application or the path to a Python file.
Ans: Flask-WTF offers simple integration with WTForms. Features of Flask-WTF include: integration with WTForms; secure forms with a CSRF token; global CSRF protection; internationalization integration; reCAPTCHA support; file upload that works with Flask-Uploads.
Ans: Flask is a "micro framework" primarily built for small applications with simpler requirements. In Flask, you have to use external libraries for anything beyond the core, but the framework itself is ready to use. Pyramid is built for larger applications. It provides flexibility and lets the developer use the right tools for their project: the developer can choose the database, URL structure, templating style and more. Pyramid is heavily configurable. Like Pyramid, Django can also be used for larger applications. It includes an ORM.
Ans: Flask is a web micro framework for Python based on "Werkzeug, Jinja 2 and good intentions", and it is BSD licensed. Werkzeug and Jinja2 are two of its dependencies. Flask belongs to the micro-framework category, which means it has little to no dependency on external libraries. This keeps the framework light, with few dependencies to update and fewer security bugs.
Ans: The split function breaks a string into shorter strings using the defined separator and returns them as a list. With no separator, it splits on whitespace and so gives a list of all words present in the string.
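A brief sketch of the common call forms of str.split, using made-up sample strings:

```python
sentence = "Python is easy to learn"
words = sentence.split()            # no separator: split on runs of whitespace

record = "name,age,city"
fields = record.split(",")          # explicit separator
first_two = record.split(",", 1)    # maxsplit=1: split only once

print(words, fields, first_two)
```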
Ans: Python comes with a huge standard library covering most Internet protocols and formats, such as email and HTML. Python does not require explicit memory management, as the interpreter itself allocates memory to new variables and frees it automatically. It is easy to read, thanks to its clean, indentation-based syntax. It is easy for beginners to learn. Its built-in data types save programming time and effort, and variables do not need to be declared.
Ans: // is the floor division operator. It divides two operands and rounds the quotient down to the nearest whole number. For instance, 10//5 = 2 and 10.0//5.0 = 2.0. For negative results the rounding is still downwards, so -7//2 = -4.
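A quick sketch of how floor division behaves with ints, floats and negative operands, together with its relationship to the % operator:

```python
print(7 // 2)       # integer operands give an int
print(7.0 // 2)     # a float operand gives a float
print(-7 // 2)      # floors towards negative infinity, not towards zero

quotient, remainder = divmod(-7, 2)
# The identity a == (a // b) * b + (a % b) always holds.
print(quotient * 2 + remainder)
```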
Ans: You can access a module written in Python from C with the following call: module = PyImport_ImportModule("<modulename>");
Ans: To generate random numbers in Python, you need to import the random module:
import random
random.random()
This returns a random floating point number in the range [0, 1)
Ans: By using the command os.remove(filename) or os.unlink(filename)
Ans: The script file's mode must be executable and the first line must begin with #! followed by the interpreter path (for example #!/usr/local/bin/python)
Ans: To share global variables across modules within a single program, create a special module, often called config. Import the config module in all modules of your application; the module then becomes available as a global name, and its attributes act as variables shared across modules.
Ans: Local variables: If a variable is assigned a new value anywhere within the function’s body, it’s assumed to be local. Global variables: Those variables that are only referenced inside a function are implicitly global.
Ans: In Python, a module is the way to structure a program. Each Python program file is a module, which can import other modules and use their objects and attributes. A folder of Python modules is a package. A package can contain modules or subfolders.
Ans: In Python 2, xrange returns an xrange object while range returns a list. The xrange object uses the same amount of memory no matter what the range size is.
Ans: In order to convert a number into a string, use the built-in function str(). If you want an octal or hexadecimal representation, use the built-in function oct() or hex().
Ans: Python sequences can be indexed with positive and negative numbers. For a positive index, 0 is the first index, 1 is the second index and so forth. For a negative index, (-1) is the last index and (-2) is the second last index and so forth.
Ans: To copy an object in Python, you can try copy.copy() or copy.deepcopy() for the general case. You cannot copy all objects but most of them.
Ans: A Python documentation string is known as a docstring. It is a way of documenting Python functions, modules and classes.
Ans: Generators are a way of implementing iterators. A generator is a normal function except that it contains a yield expression.
Ans: A mechanism to select a range of items from sequence types like list, tuple, strings etc. is known as slicing.
Ans: A unit testing framework in Python is known as unittest. It supports sharing of setups, automation testing, shutdown code for tests, aggregation of tests into collections etc.
Ans: In Python, iterators are used to iterate over a group of elements held in containers such as lists.
Ans: pass is a no-operation Python statement. In other words, it is a placeholder in a compound statement, used where a statement is syntactically required but nothing has to be done.
Ans: A lambda form in Python cannot contain statements, because it is a single expression used to create a new function object and return it at runtime.
Ans: It is a single expression anonymous function often used as In-line function.
Ans: In Python, every name introduced has a place where it lives and can be looked up. This is known as a namespace. It is like a box where a variable name is mapped to the object it refers to. Whenever the variable is looked up, this box is searched to get the corresponding object.
Ans: Python's built-in types are either mutable or immutable.
Mutable built-in types: lists, sets, dictionaries.
Immutable built-in types: strings, tuples, numbers.
Ans: They are syntax constructions to ease the creation of a Dictionary or List based on existing iterable.
Ans: Everything in Python is an object and all variables hold references to objects. These reference values are passed to functions; as a result, a function cannot change what the caller's variable refers to. However, it can change the object itself if the object is mutable.
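A short sketch of the two cases, mutation versus rebinding. The function names are hypothetical.

```python
def append_item(items):
    items.append("added")      # mutates the object the caller also sees

def rebind(items):
    items = ["replaced"]       # rebinds the local name only
    return items

shared = ["original"]
append_item(shared)
rebound = rebind(shared)

print(shared)    # mutation is visible to the caller
print(rebound)   # the caller's name still points at the old list
```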
Ans: The difference between a list and a tuple is that a list is mutable while a tuple is not. A tuple can be hashed, e.g. to serve as a key for dictionaries.
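Ans: A Python decorator is a callable that takes a function and returns a modified or wrapped version of it. The @decorator syntax placed above a function definition applies it, which makes it easy to alter the behaviour of functions without changing their code.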
Ans: PyChecker is a static analysis tool that detects bugs in Python source code and warns about the style and complexity of the code. Pylint is another tool that verifies whether a module meets the coding standard.
Ans: Python memory is managed by the Python private heap space. All Python objects and data structures are located in a private heap. The programmer does not have access to this private heap; the interpreter takes care of it. The allocation of heap space for Python objects is done by the Python memory manager. The core API gives the programmer access to some tools for coding. Python also has a built-in garbage collector, which recycles all the unused memory, frees it and makes it available to the heap space.
Ans: Python language is an interpreted language. Python program runs directly from the source code. It converts the source code that is written by the programmer into an intermediate language, which is again translated into machine language that has to be executed.
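The intermediate step can be observed with the standard library. This is a sketch for CPython; the exact instruction names printed by dis vary between Python versions, so no output is shown for that call:

```python
import dis


def add(a, b):
    return a + b


# compile() turns source text into a code object that holds bytecode.
code = compile("x = 1 + 2", "<example>", "exec")
print(type(code).__name__)  # code

# Running the code object executes its bytecode in the given namespace.
namespace = {}
exec(code, namespace)
print(namespace["x"])  # 3

# dis lists the bytecode instructions the interpreter executes for add().
dis.dis(add)
```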
Ans: The pickle module accepts almost any Python object, converts it into a byte-stream representation, and writes it to a file using the dump() function; this process is called pickling. The process of retrieving the original Python objects from the stored representation is called unpickling.
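A short round trip, assuming an illustrative dictionary named record and a temporary file path chosen for the example:

```python
import os
import pickle
import tempfile

record = {"name": "Sam", "scores": [45, 88, 56]}

# Pickling: serialize the object to a binary file with dump().
path = os.path.join(tempfile.mkdtemp(), "record.pkl")
with open(path, "wb") as fh:
    pickle.dump(record, fh)

# Unpickling: rebuild an equal (but distinct) object with load().
with open(path, "rb") as fh:
    restored = pickle.load(fh)

print(restored == record)  # True
print(restored is record)  # False
```

Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.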
Ans: PEP 8 is a coding convention: a set of recommendations on how to write more readable Python code.
Ans: Python is a programming language with objects, modules, threads, exceptions and automatic memory management. Its benefits are that it is simple and easy to learn, portable, extensible, has built-in data structures, and is open source.
Ans: You can use the apply() function to apply a function to every row of a given DataFrame. Example:
    # Import pandas package
    import pandas as pd

    # Function to add
    def add(a, b, c):
        return a + b + c

    def main():
        # create a dictionary with three fields each
        data = {
            'A': [1, 2, 3],
            'B': [4, 5, 6],
            'C': [7, 8, 9]}

        # Convert the dictionary into DataFrame
        df = pd.DataFrame(data)
        print("Original DataFrame:\n", df)

        # axis=1 passes each row to the lambda
        df['add'] = df.apply(lambda row: add(row['A'], row['B'], row['C']), axis=1)

        print('\nAfter Applying Function: ')
        # printing the new dataframe
        print(df)

    if __name__ == '__main__':
        main()
60. How will you get the top 2 rows from a DataFrame in pandas?
    # Select the first 2 rows of the DataFrame
    dfObj1 = empDfObj.head(2)
    print("First 2 rows of the Dataframe : ")
    print(dfObj1)
Output: First 2 rows of the
Please refer to training materials for the detailed answer.
Ans: A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects. This is the primary data structure of Pandas. The DataFrame.empty attribute checks whether the DataFrame is empty. It returns True if the DataFrame is empty, otherwise False.
Syntax: DataFrame.empty
Parameter: None
Returns: bool
Example #1: Use the DataFrame.empty attribute to check whether the given DataFrame is empty.
    # importing pandas as pd
    import pandas as pd

    # Creating the DataFrame
    df = pd.DataFrame({'Weight': [45, 88, 56, 15, 71],
                       'Name': ['Sam', 'Andrea', 'Alex', 'Robin', 'Kia'],
                       'Age': [14, 25, 55, 8, 21]})

    # Create the index
    index_ = [ 'Row_1' , 'Row_2' , 'Row_3' , 'Row_4' , 'Row_5
Ans: To perform some high-level mathematical functions, we can convert a Pandas DataFrame to a NumPy array with the DataFrame.to_numpy() function, which returns a numpy.ndarray.
Syntax: DataFrame.to_numpy(dtype=None, copy=False)
Parameters:
dtype: An optional parameter; the dtype to pass to numpy.asarray().
copy: A boolean, default False. Whether to ensure that the returned value is not a view on another array.
Returns: numpy.ndarray
Example 1:
    import pandas as pd

    pd.DataFrame({"P": [2, 3], "Q": [4, 5]}).to_numpy()
    info = pd.DataFrame({"P": [2, 3], "Q": [4.0, 5.8]})
    info.to_numpy()
    info['R'] = pd.date_range('2000', periods=2)
    info.to_numpy()
Output:
    array([[2, 4.0, Timestamp('2000-01-01 00:00:00')],
           [3, 5.8, Timestamp('2000-01-02 00:00:00')]], dtype=object)
Ans: Pandas provides a dedicated way to retrieve rows from a DataFrame. DataFrame.loc[] takes index labels and returns a row or a DataFrame if the index label exists in the caller DataFrame.
Syntax: pandas.DataFrame.loc[ ]
Parameters: Index label: a string or list of strings with the index labels of rows
Return type: DataFrame or Series, depending on the parameters
Example #1: Extracting a single row. In this example, the Name column is made the index column, and then two single rows are extracted one by one as Series using the index labels of the rows.
    # importing pandas package
    import pandas as pd

    # making data frame from csv file
    data = pd.read_csv("nba.csv", index_col="Name")

    # retrieving rows by loc method
    first = data.loc["Avery Bradley"]
    second = data.loc["R.J. Hunter"]

    print(first, "\n\n\n", second)
Output: Two Series were returned.
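Because nba.csv is not included here, the following sketch uses a small made-up DataFrame (the names and values are assumptions) to contrast label-based .loc with position-based .iloc and the [] indexing operator:

```python
import pandas as pd

# Illustrative data; the index holds row labels.
df = pd.DataFrame(
    {"Age": [27, 24, 22], "City": ["Delhi", "Kanpur", "Allahabad"]},
    index=["Jai", "Princi", "Gaurav"],
)

ages = df["Age"]                 # [] selects a column and returns a Series
row_by_label = df.loc["Princi"]  # .loc selects by index label
row_by_pos = df.iloc[0]          # .iloc selects by integer position
cell = df.loc["Gaurav", "City"]  # row label and column label together
block = df.iloc[0:2, 0:1]        # first two rows, first column

print(row_by_label["Age"])  # 24
print(row_by_pos["City"])   # Delhi
print(cell)                 # Allahabad
print(block.shape)          # (2, 1)
```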
Ans: Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier. Let's discuss the different ways of selecting multiple columns in a pandas DataFrame.
Method #1: Basic method. Given a dictionary that contains Employee attributes as keys and lists of those attributes as values:
    # Import pandas package
    import pandas as pd

    # Define a dictionary containing employee data
    data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
            'Age': [27, 24, 22, 32],
            'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
            'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)

    # select two columns
    df[['Name', 'Qualification']]
Select Second to fourth column. # Import pandas package imp
Ans: Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier. The DataFrame.add() method adds a DataFrame and another object element-wise (binary operator add). It is equivalent to dataframe + other, but with support for substituting a fill_value for missing data in one of the inputs.
Syntax: DataFrame.add(other, axis='columns', level=None, fill_value=None)
Parameters:
other: Series, DataFrame, or constant
axis: {0, 1, 'index', 'columns'}. For Series input, the axis to match the Series index on
fill_value: [None or float value, default None] Fill missing (NaN) values with this value. If both DataFrame locations are missing, the result will be missing.
level: [int or name] Broadcast across a level, matching Index values on the passed MultiIndex
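To illustrate the fill_value behavior described above, here is a small sketch with two made-up DataFrames named left and right:

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({"A": [1.0, np.nan], "B": [3.0, np.nan]})
right = pd.DataFrame({"A": [10.0, 20.0], "B": [np.nan, np.nan]})

# Plain addition: any NaN operand makes the result NaN.
plain = left + right

# add() with fill_value: a value missing on one side is treated as 0,
# but a cell missing on BOTH sides stays NaN.
filled = left.add(right, fill_value=0)

print(plain)
print(filled)
```

Here filled["A"] is [11.0, 20.0], while the second entry of filled["B"] remains NaN because it is missing in both inputs.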
    >>> import pandas as pd
    >>> import pandascharm as pc
    >>> import dendropy
    >>> dna_string = '3 5\nt1 TCCAA\nt2 TGCAA\nt3 TG-AA\n'
    >>> print(dna_string)
    3 5
    t1 TCCAA
    t2 TGCAA
    t3 TG-AA
    >>> matrix = dendropy.DnaCharacterMatrix.get(
    ...     data=dna_string, schema='phylip')
    >>> df = pc.from_charmatrix(matrix)
    >>> df
      t1 t2 t3
    0  T  T  T
    1  C  G  G
    2  C  C  -
    3  A  A  A
    4  A  A  A
By default, characters are stored as rows and sequences as columns in the DataFrame. If you want rows to hold sequences, just transpose the matrix in pandas:
    >>> df.transpose()
        0  1  2  3  4
    t1  T  C  C  A  A
    t2  T  G  C  A  A
    t3  T  G  -  A  A
Testing is carried out with pytest:
    $ pytest -v test_pandascharm.py
Test coverage can be calculated with Coverage.py using the following commands:
    $ coverage run -m pytest
    $ coverage report -m pandascharm.py
The code follows the style conventions in PEP 8, which can be checked with pycodestyle:
    $ pycodestyle pandascharm.py test_pandascharm.py setup.py
Installation:
    $ pip install pandas-charm
You may consider installing pandas-charm and its required Python packages within a virtual environment in order to avoid cluttering your system's Python path. See, for example, the environment management system conda or the package virtualenv.
Ans: pandas-charm is a small Python package for getting character matrices (alignments) into and out of pandas. Use this library to make pandas interoperable with BioPython and DendroPy. It converts between the following objects:
BioPython Multiple Seq Alignment and pandas DataFrame
DendroPy Character Matrix and pandas DataFrame
"Sequence dictionary" and pandas DataFrame
The code has been tested with Python 2.7, 3.5 and 3.6.
Ans: pandas_ml is a package that integrates pandas, scikit-learn, and xgboost into one package for easy handling of data and creation of machine learning models.
Installation:
    $ pip install pandas_ml
Example:
    >>> import pandas_ml as pdml
    >>> import sklearn.datasets as datasets
    # create ModelFrame instance from sklearn.datasets
    >>> df = pdml.ModelFrame(datasets.load_digits())
    >>> type(df)
    # binarize data (features), not touching target
    >>> df.data = df.data.preprocessing.binarize()
    >>> df.head()
       .target  0  1  2  3  4  5  6  7  8 ...  54  55  56  57  58  59  60  61  62  63
    0        0  0  0  1  1  1  1  0  0  0 ...   0   0   0   0   1   1   1   0   0   0
    1        1  0  0  0  1  1  1  0  0  0 ...   0   0   0   0   0   1   1   1   0   0
    2        2  0  0  0  1  1  1  0  0  0 ...   1   0   0   0   0   1   1   1   1   0
    3        3  0  0  1  1  1  1  0  0  0 ...   1   0   0   0   1   1   1   1   0   0
    4        4  0  0  0  1  1  0  0  0  0 ...   0   0   0   0   0   1   1   1   0   0
    [5 rows x 65 columns]
    # split to training and test data
    >>> train_df, test_df = df.model_selection.train_test_split(
Ans: Import modules:
    import pandas as pd
Create a dataframe:
    data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
            'year': [2012, 2012, 2013, 2014, 2014],
            'reports': [4, 24, 31, 2, 3]}
    df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
    df
                 name  reports  year
    Cochice     Jason        4  2012
    Pima        Molly       24  2012
    Santa Cruz   Tina       31  2013
    Maricopa     Jake        2  2014
    Yuma          Amy        3  2014
Delete a row:
    df.drop(['Cochice', 'Pima'])
Output:
                 name  reports  year
    Santa Cruz   Tina       31  2013
    Maricopa     Jake        2  2014
    Yuma          Amy        3  2014
Ans: In Pandas, there are different useful data operations for a DataFrame, which are as follows:
Row and column selection: We can select any row and column of the DataFrame by passing the names of the rows and columns. A single selected row or column is one-dimensional and is returned as a Series.
Filter data: We can filter the data by providing boolean expressions to the DataFrame.
Null values: A null value occurs when no data is provided for an item. Columns may contain missing values, which are usually represented as NaN.
46. Define GroupBy in Pandas?
Ans: Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier. The pandas DataFrame.groupby() function is used to split the data into groups based on some criteria.
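The split-then-aggregate idea behind groupby() can be sketched with a small made-up DataFrame (the column names team and points are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "B", "A", "B", "A"],
    "points": [10, 7, 3, 8, 5],
})

# Split the rows by team, then sum the points within each group.
totals = df.groupby("team")["points"].sum()
print(totals["A"], totals["B"])  # 18 15

# Several aggregations per group at once.
summary = df.groupby("team")["points"].agg(["mean", "max"])
print(summary)
```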
Ans: Reindexing changes the row labels and column labels of a DataFrame. To reindex means to conform the data to match a given set of labels along a particular axis. Multiple operations can be accomplished through reindexing, such as:
Reorder the existing data to match a new set of labels.
Insert missing value (NA) markers in label locations where no data for the label existed.
Example:
    import pandas as pd
    import numpy as np

    N = 20
    df = pd.DataFrame({
        'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
        'x': np.linspace(0, stop=N-1, num=N),
        'y': np.random.rand(N),
        'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
        'D': np.random.normal(100, 10, size=(N)).tolist()
    })

    # reindex the DataFrame; column 'B' does not exist, so it is filled with NaN
    df_reindexed = df.reindex(index=[0, 2, 5], columns=['A', 'C', 'B'])
    print(df_reindexed)
Its output is as follows (column C is random):
               A     C   B
    0 2016-01-01   Low NaN
    2 2016-01-03  High NaN
    5 2016-01-06   Low NaN
42. Define Multi
In order to select a single row, we put a single row label in .ix. It acts like .loc[ ] if we pass a row label as the argument. Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so this example only runs on older versions; use .loc[ ] or .iloc[ ] in current pandas.
    # importing pandas package
    import pandas as pd

    # making data frame from csv file
    data = pd.read_csv("nba.csv", index_col="Name")

    # retrieving row by ix method (older pandas only)
    first = data.ix["Avery Bradley"]
    print(first)
Output :
Ans: Indexing in Pandas: Indexing in pandas simply means selecting particular rows and columns of data from a DataFrame. Indexing could mean selecting all the rows and some of the columns, some of the rows and all of the columns, or some of each of the rows and columns. Indexing is also known as subset selection.
Pandas indexing using [ ], .loc[ ], .iloc[ ], .ix[ ]: There are many ways to pull elements, rows, and columns from a DataFrame. These indexing methods appear very similar but behave very differently. Pandas supports four types of multi-axes indexing:
DataFrame[ ]: Also known as the indexing operator
DataFrame.loc[ ]: Used for labels
DataFrame.iloc[ ]: Used for positions (integer based)
DataFrame.ix[ ] : This function i
Ans: The main task of data aggregation is to apply some aggregation to one or more columns. It uses the following:
sum: Returns the sum of the values for the requested axis.
min: Returns the minimum of the values for the requested axis.
max: Returns the maximum of the values for the requested axis.
Examples:
    import pandas as pd
    import numpy as np

    df = pd.DataFrame([[1, 2, 3],
                       [4, 5, 6],
                       [7, 8, 9],
                       [np.nan, np.nan, np.nan]],
                      columns=['A', 'B', 'C'])
    print(df)

    # Aggregate these functions over the rows.
    print(df.agg(['sum', 'min']))

    # Different aggregations per column.
    print(df.agg({'A': ['sum', 'min'], 'B': ['min', 'max']}))

    # Aggregate over the columns.
    print(df.agg("mean", axis="columns"))

    # Aggregate over the rows.
    print(df.agg("mean", axis="rows"))
OUTPUT :
         A    B    C
    0  1.0    2  3.0
    1  4.0    5  6.0
    2  7.0    8  9.0
    3  NaN  NaN  NaN
            A     B     C
    sum  12.0  15.0  18.0
    min   1.0   2.0   3.0
    A B
Ans: The code below demonstrates how to convert strings to dates:
    from datetime import datetime

    # Define dates as strings
    dmy_str1 = 'Wednesday, July 14, 2018'
    dmy_str2 = '14/7/17'
    dmy_str3 = '14-07-2017'

    # Convert the strings to datetime objects; each format must match its string
    dmy_dt1 = datetime.strptime(dmy_str1, '%A, %B %d, %Y')
    dmy_dt2 = datetime.strptime(dmy_str2, '%d/%m/%y')
    dmy_dt3 = datetime.strptime(dmy_str3, '%d-%m-%Y')

    # Print the converted dates
    print(dmy_dt1)
    print(dmy_dt2)
    print(dmy_dt3)
Output:
    2018-07-14 00:00:00
    2017-07-14 00:00:00
    2017-07-14 00:00:00
Ans: We can sort a DataFrame in two ways: by label and by actual value.
1). By label: The DataFrame can be sorted with the sort_index() method by passing the axis argument and the order of sorting. By default, sorting is done on row labels in ascending order.
    import pandas as pd
    import numpy as np

    unsorted_df = pd.DataFrame(np.random.randn(10, 2),
                               index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                               columns=['col2', 'col1'])
    sorted_df = unsorted_df.sort_index()
    print(sorted_df)
Its output is as follows:
           col2      col1
    0  0.208464  0.627037
    1  0.641004  0.331352
    2 -0.038067 -0.464730
    3 -0.638456 -0.021466
    4  0.014646 -0.737438
    5 -0.290761 -1.669827
    6 -0.797303 -0.018737
    7  0.525753  1.628921
    8 -0.56
Ans: The Pandas Series.to_frame() function is used to convert a Series object to a DataFrame.
Syntax: Series.to_frame(name=None)
name: The column name of the resulting DataFrame. Its default value is None. If a name is passed, it is substituted for the Series name.
    import pandas as pd

    s = pd.Series(["a", "b", "c"], name="vals")
    s.to_frame()
Output:
      vals
    0    a
    1    b
    2    c
Ans: We can reshape the series p into a DataFrame with 7 rows and 5 columns as in the example below:
    import pandas as pd
    import numpy as np

    # Input: 35 random integers from 1 to 6
    p = pd.Series(np.random.randint(1, 7, 35))
    info = pd.DataFrame(p.values.reshape(7, 5))
    print(info)
Output (values are random):
       0  1  2  3  4
    0  3  2  5  5  1
    1  3  2  5  5  5
    2  1  3  1  2  6
    3  1  1  1  2  2
    4  3  5  3  3  3
    5  2  5  3  6  4
    6  3  6  6  6  5
Ans: We can calculate the frequency counts of each unique value of p as in the example below:
    import pandas as pd
    import numpy as np

    p = pd.Series(np.take(list('pqrstu'), np.random.randint(6, size=17)))
    p.value_counts()
Output (values are random):
    s    4
    r    4
    q    3
    p    3
    u    3
Ans: We can compute the minimum, 25th percentile, median, 75th percentile, and maximum of p as in the example below:
    import pandas as pd
    import numpy as np

    state = np.random.RandomState(120)
    p = pd.Series(state.normal(14, 6, 22))
    np.percentile(p, q=[0, 25, 50, 75, 100])
Output:
    array([ 4.61498692, 12.15572753, 14.67780756, 17.58054104, 33.24975515])
Ans: We can get all the items of p1 and p2 that are not common to both using the example below:
    import pandas as pd
    import numpy as np

    p1 = pd.Series([2, 4, 6, 8, 10])
    p2 = pd.Series([8, 10, 12, 14, 16])

    p_u = pd.Series(np.union1d(p1, p2))      # union
    p_i = pd.Series(np.intersect1d(p1, p2))  # intersection
    p_u[~p_u.isin(p_i)]
Output:
    0     2
    1     4
    2     6
    5    12
    6    14
    7    16
    dtype: int64
Ans: We can remove the items present in p2 from p1 using the isin() method.
    import pandas as pd

    p1 = pd.Series([2, 4, 6, 8, 10])
    p2 = pd.Series([8, 10, 12, 14, 16])
    p1[~p1.isin(p2)]
Output:
    0    2
    1    4
    2    6
    dtype: int64
Ans: You can iterate over the rows of a DataFrame by using a for loop in combination with an iterrows() call on the DataFrame.
    import pandas as pd
    import numpy as np

    df = pd.DataFrame([{'c1': 10, 'c2': 100}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 120}])
    for index, row in df.iterrows():
        print(row['c1'], row['c2'])
Output:
    10 100
    11 110
    12 120
Ans: You can use the .rename method to give different values to the columns or the index values of a DataFrame. There are the following ways to change the index / columns names (labels) of a pandas.DataFrame:
Use pandas.DataFrame.rename(): change any index / columns names individually with a dict, or change all index / columns names with a function.
Use pandas.DataFrame.add_prefix() and pandas.DataFrame.add_suffix(): add a prefix or suffix to the column names.
Update the index / columns attributes of pandas.DataFrame: replace all index / columns names.
The set_index() method, which sets an existing column as the index, is also provided.
For rename(), specify the original name and the new name in a dict like {original name: new name} for index and/or columns. index is for the index names and columns is for the column names. If you want to change only one of them, you need only specify one of index or columns.
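The options listed above can be sketched on a small made-up DataFrame (the labels r1, r2, A and B are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["r1", "r2"])

# rename() with dicts: change individual labels; returns a new DataFrame.
renamed = df.rename(index={"r1": "first"}, columns={"A": "alpha"})

# rename() with a function: applied to every column label.
lowered = df.rename(columns=str.lower)

# add_prefix(): prepend a string to every column label.
prefixed = df.add_prefix("col_")

# Assigning to .columns replaces all column labels at once.
replaced = df.copy()
replaced.columns = ["x", "y"]

print(list(renamed.columns), list(renamed.index))  # ['alpha', 'B'] ['first', 'r2']
print(list(lowered.columns))                       # ['a', 'b']
print(list(prefixed.columns))                      # ['col_A', 'col_B']
print(list(replaced.columns))                      # ['x', 'y']
```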
Ans: Deleting an Index from Your DataFrame If you want to remove the index from the DataFrame, you should have to do the following: Reset the index of DataFrame. Executing del df.index.name to remove the index name. Remove duplicate index values by resetting the index and drop the duplicate values from the index column. Remove an index with a row. Deleting a Column from Your DataFrame You can use the drop() method for deleting a column from the DataFrame. The axis argument that is passed to the drop() method is either 0 if it indicates the rows and 1 if it drops the columns. You can pass the argument inplace and set it to True to delete the column without reassign the DataFrame. You can also delete the duplicate values from the column by using the drop_duplicates() method. Removing a Row from Your DataFrame By using df.drop_duplicates(), we can remove duplicate rows from the DataFrame. Y
    import pandas as pd

    employees = pd.DataFrame(
        data={'Name': ['John Doe', 'William Spark'],
              'Occupation': ['Chemist', 'Statistician'],
              'Date Of Join': ['2018-01-25', '2018-01-26'],
              'Age': [23, 24]},
        index=['Emp001', 'Emp002'],
        columns=['Name', 'Occupation', 'Date Of Join', 'Age'])

    print("\n------------ BEFORE ----------------\n")
    print(employees)

    # Assigning to a new label with .loc appends a row
    employees.loc['Emp003'] = ['Sunny', 'Programmer', '2018-01-25', 45]

    print("\n------------ AFTER ----------------\n")
    print(employees)
OUTPUT :
    C:\pandas>python example22.py
    ------------ BEFORE ----------------
                     Name    Occupation Date Of Join  Age
    Emp001       John Doe       Chemist   2018-01-25   23
    Emp002  William Spark  Statistician   2018-01-26   24
    ------------ AFTER ----------------
                     Name    Occupation Date Of Join  Age
    Emp001       John Doe       Chemist   2018-01-25   23
    Emp002  William Spark  Statistician   2018-01-26   24
    Emp003 Sunny Programmer 201
Ans: Adding an Index to a DataFrame: Pandas allows you to pass inputs to the index argument when you create a DataFrame, which ensures that you get the desired index. If you don't specify inputs, the DataFrame contains, by default, a numerically valued index that starts at 0 and ends at the last row of the DataFrame.
Adding Rows to a DataFrame: We can use .loc, .iloc, and .ix to insert rows in the DataFrame (.ix exists only in older pandas versions; it was removed in pandas 1.0). loc works on the labels of the index: loc[4] refers to the values of the DataFrame whose index is labeled 4. iloc works on the positions in the index: iloc[4] refers to the values of the DataFrame at position 4. ix is a complex case because if the index is integer-based, we pass a label to ix. The ix[4] means tha
Ans: We can add a new column to an existing DataFrame. The code below demonstrates how:
    # importing the pandas library
    import pandas as pd

    info = {'one': pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e']),
            'two': pd.Series([1, 2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e', 'f'])}
    info = pd.DataFrame(info)

    # Add a new column to an existing DataFrame object
    print("Add new column by passing series")
    info['three'] = pd.Series([20, 40, 60], index=['a', 'b', 'c'])
    print(info)

    print("Add new column using existing DataFrame columns")
    info['four'] = info['one'] + info['three']
    print(info)
Output:
    Add new column by passing series
       one  two  three
    a  1.0    1   20.0
    b  2.0    2   40.0
    c  3.0    3   60.0
    d  4.0    4    NaN
    e  5.0    5    NaN
    f  NaN    6    NaN
    Add new column using existing DataFrame columns
       one  two  three  four
    a  1.0    1   20.0  21.0
    b  2.0    2   40.0  42.0
    c  3.0    3   60.0  63.0
    d 4.0 4 N
Ans: We can create a copy of a Series by using the following syntax:
pandas.Series.copy
Series.copy(deep=True)
The above statement makes a deep copy that includes a copy of the data and the indices. If we set deep to False, neither the indices nor the data are copied.
25. ) How will you create an empty DataFrame in Pandas?
Ans: A DataFrame is a widely used data structure of pandas that works as a two-dimensional array with labeled axes (rows and columns). It is a standard way to store data and has two different indexes, i.e., a row index and a column index.
Create an empty DataFrame: The code below shows how to create an empty DataFrame in Pandas:
    # importing the pandas library
    import pandas as pd

    info = pd.DataFrame()
    print(info)
Output:
    Empty DataFrame
    Columns: []
    Index: []
Ans: A Series is defined as a one-dimensional array that is capable of storing various data types. We can create a Pandas Series from a dictionary.
Create a Series from dict: If a dictionary is passed as input and the index is not specified, the dictionary keys are used to construct the index (in sorted order in older pandas versions; in insertion order with pandas 0.23 or later on Python 3.6+). If an index is passed, the values corresponding to each label in the index are extracted from the dictionary.
    import pandas as pd
    import numpy as np

    info = {'x': 0., 'y': 1., 'z': 2.}
    a = pd.Series(info)
    print(a)
Output:
    x    0.0
    y    1.0
    z    2.0
    dtype: float64
Ans: Categorical data is a Pandas data type that corresponds to a categorical variable in statistics. A categorical variable takes a limited and usually fixed number of possible values. Examples: gender, country affiliation, blood type, social class, observation time, or rating via Likert scales. All values of categorical data are either in the categories or np.nan. This data type is useful in the following cases:
It is useful for a string variable that consists of only a few different values. If we want to save some memory, we can convert such a string variable to a categorical variable.
It is useful when the lexical order of a variable is not the same as its logical order ("one", "two", "three"). By converting to a categorical and specifying an order on the categories, sorting and min/max will use the logical order instead of the lexical order.
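The logical-versus-lexical ordering point can be sketched with the "one", "two", "three" example; the variable names below are assumptions for illustration:

```python
import pandas as pd

sizes = pd.Series(["two", "three", "one", "two"])

# Plain strings sort lexically.
print(sorted(sizes))  # ['one', 'three', 'two', 'two']

# An ordered categorical sorts by the declared category order instead.
ordered = pd.Categorical(sizes, categories=["one", "two", "three"], ordered=True)
cat_sizes = pd.Series(ordered)

print(cat_sizes.sort_values().tolist())  # ['one', 'two', 'two', 'three']
print(cat_sizes.min(), cat_sizes.max())  # one three
```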
Ans: We can create a DataFrame in the following ways: from lists, and from a dict of ndarrays.
Example-1: Create a DataFrame using a list:
    import pandas as pd

    # a list of strings
    a = ['Python', 'Pandas']

    # Calling DataFrame constructor on list
    info = pd.DataFrame(a)
    print(info)
Output:
            0
    0  Python
    1  Pandas
Example-2: Create a DataFrame from a dict of ndarrays:
    import pandas as pd

    info = {'ID': [101, 102, 103], 'Department': ['B.Sc', 'B.Tech', 'M.Tech']}
    info = pd.DataFrame(info)
    print(info)
Output:
        ID Department
    0  101       B.Sc
    1  102     B.Tech
    2  103     M.Tech
Please refer to training materials for the detailed answer.
Please refer to training materials for the detailed answer.
Ans: Scatter_matrix
Ans: In Python 2 we have the following two functions to produce a list of numbers within a given range: range() and xrange(). In Python 3, xrange() has been removed, so there is only one function to produce the numbers within a given range: range(). However, range() in Python 3 works the same way as xrange() in Python 2 (i.e., it is evaluated lazily). So the difference between range() and xrange() becomes relevant only when you are using Python 2.
range() and xrange() function values:
a). range() creates a list, i.e., range returns a Python list object. For example, range(1, 500, 1) will create a Python list of 499 integers in memory. Remember, range() generates all numbers at once.
b). xrange() returns an xrange object that evaluates lazily. That means
Ans: To start a project in Django, run the command django-admin startproject followed by the project name (django-admin.py in older Django versions). The generated project includes the following files:
__init__.py
manage.py
settings.py
urls.py
Ans: Adding a new column to an existing DataFrame in Pandas:
    # Import pandas package
    import pandas as pd

    # Define a dictionary containing Students data
    data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
            'Height': [5.1, 6.2, 5.1, 5.2],
            'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)

    # Declare a list that is to be converted into a column
    address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']

    # Using 'Address' as the column name
    # and equating it to the list
    df['Address'] = address

    # Observe the result
    df
Ans: To create a completely empty Pandas DataFrame, we do the following:
    import pandas as pd

    MyEmptydf = pd.DataFrame()
This creates an empty DataFrame with no columns or rows. To create an empty DataFrame with three empty columns (X, Y and Z), we do:
    df = pd.DataFrame(columns=['X', 'Y', 'Z'])
Ans: A Pandas DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. It is generally the most commonly used pandas object. A DataFrame can be created in multiple ways. Let's discuss the different ways one by one.
Creating a Pandas DataFrame from a list of lists:
    # Import pandas library
    import pandas as pd

    # initialize list of lists
    data = [['tom', 10], ['nick', 15], ['juli', 14]]

    # Create the pandas DataFrame
    df = pd.DataFrame(data, columns=['Name', 'Age'])

    # print dataframe.
    df
Ans: A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns. It consists of three principal components: the data, the rows, and the columns.
Creating a Pandas DataFrame: In the real world, a Pandas DataFrame is usually created by loading datasets from existing storage, such as a SQL database, a CSV file, or an Excel file. A DataFrame can also be created from lists, a dictionary, a list of dictionaries, etc. Here are some ways in which we can create a DataFrame:
Creating a DataFrame using a list: A DataFrame can be created using a single list or a list of lists.
    # import pandas as pd
    import pandas as pd

    # list of strings
    lst = [‘Geeks’, ‘For’, ‘Gee
Ans: A Pandas Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index. A Pandas Series is comparable to a column in an Excel sheet.
Creating a Pandas Series: In the real world, a Pandas Series is usually created by loading datasets from existing storage, such as a SQL database, a CSV file, or an Excel file. A Series can also be created from lists, a dictionary, a scalar value, etc. Here are some ways in which we can create a Series:
Creating a Series from an array: To create a Series from an array, we import the numpy module and use its array() function.
    # import pandas as pd
    import pandas as pd

    # import numpy as np
    import numpy as np

    # simple array
    data = np.array(['g', 'e', 'e', 'k', 's'])
    ser = pd.Series(data)
    print(se
Ans: A Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). It has to be remembered that unlike Python lists, a Series will always contain data of the same type. Let’s see how to create a Pandas Series from Dictionary. Using Series() method without index parameter.
Ans: Categorical are a pandas data type corresponding to categorical variables in statistics. A categorical variable takes on a limited and usually fixed, number of possible values (categories; levels in R). Examples are gender, social class, blood type, country affiliation, observation time or rating via Likert scales. All values of categorical data are either in categories or np.nan. The categorical data type is useful in the following cases: A string variable consisting of only a few different values. Converting such a string variable to a categorical variable will save some memory, The lexical order of a variable is not the same as the logical order (“one”, “two”, “three”). By converting to a categorical and specifying an order on the categories, sorting and min/max will use the logical order instead of the lexical order, As a signal to other Python libraries that this column should
Ans: A time series is an ordered sequence of data that represents how some quantity changes over time. pandas contains extensive capabilities and features for working with time series data for all domains. pandas supports:
Parsing time series information from various sources and formats
Generating sequences of fixed-frequency dates and time spans
Manipulating and converting date times with timezone information
Resampling or converting a time series to a particular frequency
Performing date and time arithmetic with absolute or relative time increments
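A few of these capabilities can be sketched on a tiny made-up daily series (the dates and values are assumptions for the example):

```python
import pandas as pd

# Generate a fixed-frequency sequence of dates and use it as an index.
days = pd.date_range(start="2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=days)

# Parse a date string and use it to look up a value.
parsed = pd.to_datetime("2024-01-03")
print(ts.loc[parsed])  # 3

# Resample to a 2-day frequency, summing the values in each bin.
two_day = ts.resample("2D").sum()
print(two_day.tolist())  # [3, 7, 11]

# Date arithmetic with a relative time increment.
shifted = days[0] + pd.Timedelta(days=7)
print(shifted.date())  # 2024-01-08

# Attach timezone information to a naive index.
localized = ts.tz_localize("UTC")
print(localized.index.tz)
```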
Ans: pandas.Series.copy
Series.copy(deep=True)
Makes a deep copy, including a copy of the data and the indices. With deep=False, neither the indices nor the data are copied. Note that when deep=True the data is copied, but actual Python objects are not copied recursively, only the references to the objects.
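A rough way to observe the difference is to check whether the underlying buffers share memory. This is a sketch that assumes a plain integer Series; how writes to a shallow copy propagate depends on the pandas version (copy-on-write), so the example only inspects memory sharing:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])
deep = s.copy(deep=True)      # new data buffer
shallow = s.copy(deep=False)  # same data buffer, new Series object

print(np.shares_memory(s.to_numpy(), deep.to_numpy()))     # False
print(np.shares_memory(s.to_numpy(), shallow.to_numpy()))  # True

# Even a deep copy does not recurse into Python objects stored in the Series:
nested = pd.Series([[1, 2], [3, 4]])
nested_copy = nested.copy(deep=True)
print(nested_copy.iloc[0] is nested.iloc[0])  # True: same list object
```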
Ans: This library is written for the Python programming language for performing operations like data manipulation, data analysis, etc. The library provides various operations as well as data structures to manipulate time series and numerical tables.
Ans: The pandas library has various features; some of them are: data alignment, memory efficiency, reshaping, merge and join, and time series.
Ans: Re-indexing means to conform DataFrame to a new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. It changes the row labels and column labels of a DataFrame.
Ans: There are two data structures supported by the pandas library: Series and DataFrame. Both are built on top of NumPy. Series is the one-dimensional data structure in pandas and DataFrame is the two-dimensional one. Older versions also had Panel, a three-dimensional data structure with the axes items, major_axis, and minor_axis; it was deprecated in pandas 0.20 and has since been removed.
Ans: Pandas Series is a one-dimensional labelled array capable of holding data of any type (integer, string, float, python objects, etc.). The axis labels are collectively called index. Pandas Series is nothing but a column in an excel sheet.
Ans: Pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. pandas is free software released under the three-clause BSD license.
Ans: Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python .